How I Designed a Scalable Music Streaming Platform Without Burning Cloud Costs

Junaid Rashid
MVP Developer
May 6, 2026

The Problem
A client came to me with an idea: he wanted to build a music streaming app for music fans. The app needed to play both audio and video. It had to run on iPhone, Android, and the web. It had to support free users (with ads) and paid users (with offline downloads and no ads). And it had to launch in a few months. That was the easy part. The hard part was the budget.
What the Client Wanted
- A working app in 3 to 6 months, ready for 20k+ users at launch.
- Growth to 100k+ users within two years.
- Monthly cloud cost between $300 and $500.
- Fast playback: A song should start in under one second.
- Smooth experience on slow internet.
- Built to grow later: AI recommendations, social features, ads, and DRM in phase two.

He was also clear about what he did not want:
- No Kubernetes.
- No microservices.
- No expensive AWS services like MediaConvert or Kinesis.
- No heavy analytics tools on day one.
These were not random rules. They kept the system simple enough for a small team to run at 2 a.m. when something breaks.
The Core Idea Behind My Design
Before I picked any technology, I made one decision that shaped everything else: The backend should never touch the actual music or video files.
Music and video are big. If the backend had to stream video to users, the cloud bill would blow past $500 in the first week. So I designed the system in two halves:
- The backend handles logins, playlists, payments, search, and JSON responses.
- Cloudflare handles all the music and video files, playback, storage, and CDN delivery.
When a user wants to play a song, the backend does not send the song. It sends a signed link to Cloudflare, and Cloudflare streams the song directly to the user. The backend is out of the way in milliseconds. This one decision is why the whole thing fits inside $500 a month.
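To make the signed-link idea concrete, here is a minimal sketch of a generic HMAC-signed, expiring URL. It is an illustration under assumptions, not the project's actual code and not Cloudflare's wire format: Cloudflare's products have their own signing mechanisms, and the host `media.example.com`, the `expires` and `sig` query parameters, and both function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical host: the real one comes from your CDN configuration.
const MEDIA_HOST = "https://media.example.com";

// HMAC over "path:expiry" so neither part can be changed without the secret.
function sign(path: string, expires: number, secret: string): string {
  return createHmac("sha256", secret).update(`${path}:${expires}`).digest("hex");
}

// Backend side: build a short-lived link for one media file.
function createSignedUrl(
  path: string,
  ttlSeconds: number,
  secret: string,
  nowMs: number = Date.now(),
): string {
  const expires = Math.floor(nowMs / 1000) + ttlSeconds;
  return `${MEDIA_HOST}${path}?expires=${expires}&sig=${sign(path, expires, secret)}`;
}

// Edge side: accept the request only if the link is unexpired and untampered.
function verifySignedUrl(url: string, secret: string, nowMs: number = Date.now()): boolean {
  const parsed = new URL(url);
  const expires = Number(parsed.searchParams.get("expires"));
  if (!Number.isFinite(expires) || expires < Math.floor(nowMs / 1000)) return false;
  const given = Buffer.from(parsed.searchParams.get("sig") ?? "", "hex");
  const expected = Buffer.from(sign(parsed.pathname, expires, secret), "hex");
  // Constant-time comparison avoids leaking how many characters matched.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

The point of the pattern is that the backend only does a cheap hash and returns a string. The bytes of the song never pass through it.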
The Architecture
Here is the full picture, top to bottom:
Users
iPhone, Android, web apps, and an admin panel for the client's internal team.
Cloudflare (Media Layer)
- R2 stores all music and video files.
- Cloudflare Stream prepares videos for adaptive playback.
- Cloudflare CDN delivers files from edge servers close to users.
AWS (Backend Layer)
- AWS WAF blocks malicious traffic.
- Load Balancer distributes requests.
- ECS with Fargate runs backend containers.
- RDS PostgreSQL stores application data.
- Redis caches hot data.
- Elasticsearch powers search once the catalog outgrows PostgreSQL full-text search.
- Secrets Manager stores credentials securely.
- CloudWatch handles logs and monitoring.
Outside Services
- Stripe for web and Android payments.
- Apple In-App Purchases for iOS.
- Firebase for push notifications.
- AWS SES for transactional emails.
What I Built, Module by Module
The backend is one application, but internally it is divided into clean modules. Each module has a single responsibility and can later evolve into its own service if needed.
- Auth & Users — Sign-up, login, JWT auth, refresh sessions with Redis.
- Subscription & Billing — Stripe and Apple billing integration with grace periods.
- Media — Stores metadata and generates signed Cloudflare streaming links.
- Playlists — Public and private playlists with ranking support.
- Playback — Tracks listening sessions and playback permissions.
- Offline Downloads — Secure encrypted downloads for premium users.
- Admin — Artist management, uploads, moderation, and admin permissions.
- Notifications — Push notifications and email delivery.
- Analytics — Tracks play counts and artist statistics.
- Search — Fast song, playlist, and artist search.
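One way to picture the module boundaries listed above is a narrow interface between modules. The sketch below is an assumption-laden illustration in plain TypeScript, not the project's NestJS code: `SubscriptionReader`, `PlaybackPermissions`, `hasPremiumAccess`, and the 3-day grace period are all hypothetical.

```typescript
interface Subscription {
  userId: string;
  plan: "free" | "premium";
  paidUntil: Date;
}

// The only thing other modules may know about Subscription & Billing.
interface SubscriptionReader {
  findByUser(userId: string): Subscription | undefined;
}

const GRACE_PERIOD_DAYS = 3; // assumed value, for illustration only
const DAY_MS = 24 * 60 * 60 * 1000;

// Premium access continues through the grace period after a missed payment.
function hasPremiumAccess(sub: Subscription | undefined, now: Date): boolean {
  if (!sub || sub.plan !== "premium") return false;
  return now.getTime() <= sub.paidUntil.getTime() + GRACE_PERIOD_DAYS * DAY_MS;
}

// Playback depends on the interface, not on Stripe or Apple billing details.
class PlaybackPermissions {
  private readonly subscriptions: SubscriptionReader;

  constructor(subscriptions: SubscriptionReader) {
    this.subscriptions = subscriptions;
  }

  canDownloadOffline(userId: string, now: Date): boolean {
    return hasPremiumAccess(this.subscriptions.findByUser(userId), now);
  }

  showsAds(userId: string, now: Date): boolean {
    return !hasPremiumAccess(this.subscriptions.findByUser(userId), now);
  }
}
```

Because Playback only sees `SubscriptionReader`, the billing module could later move behind a network call without Playback changing.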
Why I Chose Each Technology
NestJS (Backend Framework)
- Why: Built for clean modular TypeScript applications. Excellent for small teams and long-term maintainability.
- Why not Express: Too barebones.
- Why not Spring Boot: Too heavy.
- Why not Django: Different language ecosystem.
PostgreSQL (Main Database)
- Why: Reliable relational database with support for JSON fields and full-text search.
- Why not MongoDB or DynamoDB: The data model is relational. Users, playlists, artists, and songs are deeply connected.
Redis (Caching)
- Why: Stores sessions, rate limits, and hot song data in memory to reduce database load.
- Why not skip it: Without caching, every request from 20k+ users would hit the database directly.
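The "hot song data" case is typically handled with a cache-aside read. Here is a minimal sketch under assumptions: `KeyValueStore` is a hypothetical interface shaped like a Redis client's get and set-with-expiry, `InMemoryStore` is a stand-in so the example runs without a Redis server, and the `song:` key prefix and 300-second TTL are illustrative.

```typescript
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for Redis, used only to keep the sketch self-contained.
class InMemoryStore implements KeyValueStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return null;
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

interface Song {
  id: string;
  title: string;
}

// Cache-aside: try the cache, fall back to the database, then fill the cache.
async function getSongCached(
  id: string,
  cache: KeyValueStore,
  loadFromDb: (id: string) => Promise<Song | null>,
  ttlSeconds: number = 300,
): Promise<Song | null> {
  const key = `song:${id}`;
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as Song;

  const song = await loadFromDb(id);
  if (song) await cache.set(key, JSON.stringify(song), ttlSeconds);
  return song;
}
```

With this shape, a popular song costs one database read per TTL window instead of one per request.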
Cloudflare (Media + CDN)
- Why: Lower bandwidth costs, excellent edge delivery in Africa and developing regions, and adaptive streaming support.
- Why not AWS CloudFront: Egress costs would exceed the project's budget quickly.
AWS ECS with Fargate
- Why: Runs containers without managing servers. Easy autoscaling and low operational overhead.
- Why not EC2: Manual infrastructure management.
- Why not Kubernetes: Too much complexity for the team's size.
Elasticsearch
- Why: Millisecond-level search once the catalog grows large.
- Strategy: Start with PostgreSQL full-text search and move to Elasticsearch later, when scale requires it.
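The "start with PostgreSQL full-text search" step could look roughly like the parameterized query below. This is a sketch under assumptions: the `songs` table, its `title` and `artist_name` columns, and a precomputed `search_vector` column of type `tsvector` are hypothetical. The PostgreSQL functions `plainto_tsquery` and `ts_rank` and the `@@` match operator are real.

```typescript
interface SqlQuery {
  text: string;
  values: unknown[];
}

// Builds a ranked full-text search over a hypothetical songs table.
// The { text, values } shape suits any driver that accepts $1-style parameters.
function buildSongSearchQuery(term: string, limit: number = 20): SqlQuery {
  const safeLimit = Math.min(Math.max(1, Math.floor(limit)), 100);
  return {
    text: `
      SELECT id, title, artist_name,
             ts_rank(search_vector, plainto_tsquery('english', $1)) AS rank
      FROM songs
      WHERE search_vector @@ plainto_tsquery('english', $1)
      ORDER BY rank DESC
      LIMIT $2`,
    values: [term.trim(), safeLimit],
  };
}
```

Keeping the query behind one function means a later switch to Elasticsearch touches a single module rather than every caller.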
GitHub Actions
Why: Free, simple CI/CD pipeline integrated directly with the repository. Automatically tests, builds, and deploys backend containers to ECS.
The Two Tricky Problems I Solved
Problem 1: Offline Downloads That Cannot Be Stolen
Paid users can download songs for offline listening. The challenge was preventing users from copying and sharing raw files.
My solution: Encryption with split keys.
When a user downloads a song, the server generates a random encryption key. The server keeps only an identifier for that key, not the key itself. The real key is stored securely on the user's device, using Keychain on iPhone or Keystore on Android. The song is encrypted locally and can only be decrypted with that device-specific key.
The clever part: Only one device can have active downloads at a time. Logging in on another device invalidates previous keys. This delivered DRM-like protection without paying for enterprise DRM systems.
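The flow described above can be sketched as follows. This is a simplified illustration, not the production design: it uses Node's `crypto` module as a stand-in for the device-side code, which on a real phone would use the platform's secure key storage. AES-256-GCM is an assumed cipher choice, and `DownloadKeyRegistry`, `encryptSong`, and `decryptSong` are hypothetical names.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, randomUUID } from "node:crypto";

interface DownloadKey {
  keyId: string; // the server remembers this
  key: Buffer;   // the server hands this to the device and forgets it
}

interface EncryptedSong {
  iv: Buffer;
  tag: Buffer;
  data: Buffer;
}

// Server side: tracks the active device and its key ids, never the keys.
class DownloadKeyRegistry {
  private activeDevice = new Map<string, string>();
  private activeKeyIds = new Map<string, Set<string>>();

  // Logging in on a different device invalidates all previous key ids.
  activateDevice(userId: string, deviceId: string): void {
    if (this.activeDevice.get(userId) === deviceId) return;
    this.activeDevice.set(userId, deviceId);
    this.activeKeyIds.set(userId, new Set());
  }

  issueKey(userId: string, deviceId: string): DownloadKey {
    if (this.activeDevice.get(userId) !== deviceId) {
      throw new Error("device is not the active download device");
    }
    const issued = { keyId: randomUUID(), key: randomBytes(32) };
    this.activeKeyIds.get(userId)!.add(issued.keyId);
    return issued;
  }

  isActive(userId: string, keyId: string): boolean {
    return this.activeKeyIds.get(userId)?.has(keyId) ?? false;
  }
}

// Device side: encrypt the downloaded bytes with the device-held key.
function encryptSong(plain: Buffer, key: Buffer): EncryptedSong {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plain), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), data };
}

// Device side: decryption fails if the key is wrong or the file was modified.
function decryptSong(enc: EncryptedSong, key: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, enc.iv);
  decipher.setAuthTag(enc.tag);
  return Buffer.concat([decipher.update(enc.data), decipher.final()]);
}
```

In a setup like this, the app would presumably check `isActive` when it is online and delete local keys whose ids have been invalidated. A copied file is useless on its own because the key never leaves secure storage on the original device.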
Problem 2: Counting Plays Without Killing the Database
Play counts sound simple, but writing every tap directly to the database creates huge load, and counting skips and accidental taps makes the analytics inaccurate.
My solution: A 30-second validation rule plus cached aggregates.
A play only counts after 30 seconds of listening. Every valid play is stored in a lightweight tracking table.
For dashboards, pre-calculated totals (weekly plays, monthly plays, and so on) are cached separately. Artist dashboards usually read these cached aggregates instead of running raw analytics queries. The current day's numbers are intentionally excluded, because the day is still incomplete.
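The two rules above, the 30-second threshold and the exclusion of the current day, can be sketched as a small aggregation function. This is an illustration under assumptions: the `PlayEvent` shape and function names are hypothetical, days are cut at UTC midnight, and in practice the result would be computed by a scheduled job and written to the cache.

```typescript
const MIN_PLAY_SECONDS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

interface PlayEvent {
  songId: string;
  listenedSeconds: number;
  playedAt: Date;
}

// A play counts only once the listener has reached the 30-second mark.
function isValidPlay(event: PlayEvent): boolean {
  return event.listenedSeconds >= MIN_PLAY_SECONDS;
}

// Midnight UTC of the given date, in epoch milliseconds (assumed day boundary).
function startOfUtcDay(date: Date): number {
  return Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate());
}

// Valid plays per song over the last `days` complete days, excluding today.
function aggregatePlays(events: PlayEvent[], now: Date, days: number): Map<string, number> {
  const end = startOfUtcDay(now);  // exclusive: today is still incomplete
  const start = end - days * DAY_MS;
  const totals = new Map<string, number>();
  for (const event of events) {
    const playedAt = event.playedAt.getTime();
    if (!isValidPlay(event) || playedAt < start || playedAt >= end) continue;
    totals.set(event.songId, (totals.get(event.songId) ?? 0) + 1);
  }
  return totals;
}
```

Calling `aggregatePlays(events, new Date(), 7)` would give the weekly totals, and `30` an approximate monthly window.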
The Outcome
- Monthly launch cost stayed around $350.
- Playback starts in under one second.
- 20k+ users handled comfortably using two backend containers and one database.
- Scales cleanly to 100k+ users with read replicas and additional containers.
- Future AI recommendations, ads, social features, and DRM can be added without a rewrite.
- A single engineer can operate the system without Kubernetes or microservices.
When you have a small budget and big ambitions, the architect's job is not to show off with fancy tools. It is to make a few simple pieces do a lot of work. The real skill is saying no to complexity until the product actually needs it. Boring, simple, and well-chosen beats clever every time.
— Junaid Rashid