Scaling a System from Zero
Optimizing cost is one of the hardest problems in system design, because it requires balancing cost against performance. Scaling a system from scratch to millions of users requires a clear roadmap. In this post, I will explain how to scale a system from the MVP stage to one capable of serving millions of users.
Summary
A scalable system evolves through several stages. Each stage comes with different trade-offs, so you should choose the approach that best fits your business needs.
- Deploy your entire system on a single VPS.
- Split your system into separate frontend and backend servers.
- Cluster and load balancer
- Separate the database into master and slave
- Caching system
- Elasticsearch
1. Deploy your entire system on a single VPS
At this level, you place your code on a single VPS server and run your backend, frontend, database, and everything else on it. The purpose of this level is to keep the system architecture as simple as possible. Development speed is the most important factor at this stage, so you can skip complex system structure.
Using a full-stack framework such as Next.js or Remix is a good option at this stage. The important thing is to implement your MVP and test the market.
The cost can be close to zero if you deploy on an old machine you already own and tunnel it to the internet through Cloudflare.
2. Split your system into separate frontend and backend servers
Split your system into separate frontend (FE) and backend (BE) servers. This improves the load your system can handle. One server is responsible for the UI: cached components, optimized static content, blogs, query caching, and so on. The other handles application logic; it is responsible for business logic and data consistency, and it is the source of truth.
This costs more than the "all-in-one" level, but it increases the load the system can handle and serves more users.
3. Cluster and load balancer
Run multiple instances of your application and add a load balancer to route incoming requests based on the load of each VPS. This spreads requests across the cluster: the load balancer ensures that no server becomes overloaded while others remain idle.
A load balancer supports several algorithms to decide which instance a request belongs to, based on the number of active requests (least connections), bandwidth, server capacity (weighted), a hash of the client address (IP hash), or response time.
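Two of the algorithms above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server pool names are made up:

```python
import itertools

# Hypothetical backend pool; the names and count are illustrative.
SERVERS = ["app-1", "app-2", "app-3"]

# Round robin: cycle through the servers in a fixed order.
_rr = itertools.cycle(SERVERS)

def round_robin() -> str:
    return next(_rr)

# Least connections: pick the server currently handling the fewest requests.
active = {s: 0 for s in SERVERS}

def least_connections() -> str:
    server = min(active, key=active.get)
    active[server] += 1  # the caller must decrement this when the request finishes
    return server
```

Round robin is simplest, but least connections adapts better when requests have very different durations.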
4. Separate the database into master and slave
The master-slave pattern separates reads and writes. Some applications write more than they read, and vice versa, so splitting capacity 50/50 between reads and writes may not be effective. Dividing the database into a master (writes) and slaves (reads) addresses this problem.
This approach is conceptually similar to the CQRS pattern, as it separates read and write workloads. Operators can scale reads or writes independently based on their business. However, read replicas introduce replication lag, which means read data may not always be real-time. This is the main trade-off of this approach.
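The read/write split described above usually lives in a small routing layer in front of the database clients. A minimal sketch, assuming hypothetical endpoint names and treating any `SELECT` as a read:

```python
import random

# Hypothetical endpoints; in practice these would be real DB connections.
MASTER = "db-master:5432"
REPLICAS = ["db-replica-1:5432", "db-replica-2:5432"]

def route(sql: str) -> str:
    """Send writes to the master, spread reads across the replicas."""
    is_read = sql.lstrip().upper().startswith("SELECT")
    if is_read:
        # A replica may lag behind the master (replication lag),
        # so reads served here are eventually consistent.
        return random.choice(REPLICAS)
    return MASTER
```

Real proxies (or ORM read/write splitting) do the same classification more carefully, but the core idea is just this routing decision.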
5. Caching system
This level significantly improves query performance. Each query result can be cached after the first request, so the system can serve thousands of requests quickly by reading from the cache instead of recomputing the same data repeatedly.
Common choices are Redis, KeyDB, Memcached, and similar tools. These caches are fast because they read from RAM instead of disk. This means higher RAM consumption, so you need to keep the cache under control as it grows.
Planning cache invalidation and managing TTLs (time to live) effectively keeps the app healthy.
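The cache-aside pattern with a TTL, as described above, can be sketched with a plain dictionary standing in for Redis or KeyDB (the function names here are illustrative, not any library's API):

```python
import time

# In-memory stand-in for Redis/KeyDB; the TTL idea is the same.
_cache: dict = {}

def cache_get(key, compute, ttl=60.0):
    """Return the cached value for key, recomputing it only when the TTL expires."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < ttl:
        return entry[1]          # cache hit: skip the expensive query
    value = compute()            # cache miss or expired: recompute
    _cache[key] = (now, value)
    return value

def invalidate(key):
    _cache.pop(key, None)        # call this after writes that change the data
```

Invalidating on write (rather than waiting for the TTL) keeps readers from seeing stale data longer than necessary; the TTL acts as a safety net.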
6. Elasticsearch
In an e-commerce system, search is a primary feature that helps users find products. Elasticsearch uses an inverted index, which makes full-text search extremely fast even for large datasets.
Elasticsearch is designed as a distributed system, with built-in sharding and replication.
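To see why an inverted index makes search fast, here is a toy version: instead of scanning every document per query, we precompute a map from each token to the documents containing it. This is a simplified sketch with made-up product data, not how Elasticsearch is implemented internally:

```python
from collections import defaultdict

# Inverted index: token -> set of document ids containing that token.
index = defaultdict(set)

docs = {
    1: "red running shoes",
    2: "blue running jacket",
    3: "red winter jacket",
}

# Build the index once, at write time.
for doc_id, text in docs.items():
    for token in text.lower().split():
        index[token].add(doc_id)

def search(query: str) -> set:
    """Return ids of documents containing every token in the query."""
    postings = [index[t] for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()
```

A query then only intersects a few small sets instead of scanning all documents, which is why lookups stay fast as the dataset grows. Real engines add tokenization, ranking, sharding, and replication on top of this idea.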