Free Study Guide · 2025

Top 10 System Design Interview Questions & Answers (2025)

System design interviews are often the deciding round for SDE-2 and higher roles at product companies. Questions on load balancing, caching, databases, and microservices come up at Flipkart, Amazon, Google, Razorpay, and CRED for senior engineering positions.

10 questions
Detailed answers
100% free
1. How would you design a URL shortener like bit.ly?
Key components: URL generation (base62-encode a unique ID or hash the URL), storage (a key-value store like Redis for hot URLs plus MySQL for persistence), redirection (HTTP 301 for permanent redirects, 302 when you want to track clicks), and analytics (count clicks asynchronously via a message queue). Scale: use consistent hashing to distribute ID generation, a CDN for edge redirects, and DB read replicas. Handle hash collisions by appending a random suffix or retrying with a new ID.
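The base62 step above can be sketched in a few lines. This is a minimal illustration (the alphabet ordering and function names are our own choices, not a specific library's API):

```python
# Base62 alphabet: digits, lowercase, then uppercase (62 symbols total).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a short base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most-significant digit first

def decode_base62(s: str) -> int:
    """Recover the integer ID from its base62 form."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

A 7-character base62 code covers 62^7 ≈ 3.5 trillion IDs, which is why shorteners rarely need longer slugs.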
2. What is the difference between horizontal and vertical scaling?
Vertical scaling (scaling up) means adding more resources to a single server (more CPU, RAM). It's simple but has limits and is a single point of failure. Horizontal scaling (scaling out) means adding more servers to distribute load. It requires a load balancer, stateless services (or shared session storage), and distributed data. Most modern architectures prefer horizontal scaling for high availability and cost efficiency.
3. What is a load balancer and what algorithms does it use?
A load balancer distributes incoming traffic across multiple backend servers for high availability and scalability. Algorithms: Round Robin (cyclic distribution), Weighted Round Robin (based on server capacity), Least Connections (route to server with fewest active connections), IP Hash (consistent routing for a client to the same server — useful for session affinity), and Random. L4 (TCP) vs L7 (HTTP-aware) load balancers differ in what they can inspect.
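Two of the algorithms above can be sketched as toy classes (an illustration of the routing logic only, not a real load balancer — the class and method names are hypothetical):

```python
import itertools

class RoundRobinBalancer:
    """Cycle through servers in order, one request at a time."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Route each request to the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1   # connection opened
        return server

    def release(self, server):
        self.active[server] -= 1   # connection closed
```

IP hash is even shorter: `servers[hash(client_ip) % len(servers)]`, which keeps a given client pinned to the same backend.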
4. What is caching and what are common caching strategies?
Caching stores frequently-accessed data in fast storage (Redis, Memcached) to reduce database load and latency. Strategies: Cache-aside (app checks cache, falls back to DB, writes to cache — most common), Write-through (write to cache and DB synchronously), Write-behind (write to cache, async to DB), Read-through (cache sits in front of DB). Cache invalidation (when to evict stale data) is the hardest problem.
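The cache-aside pattern is short enough to show end to end. Here plain dicts stand in for Redis and the database (both stores and the key format are hypothetical placeholders):

```python
cache = {}                               # stands in for Redis
db = {"user:1": {"name": "Asha"}}        # stands in for MySQL/Postgres

def get_user(key):
    if key in cache:                     # 1. check the cache first
        return cache[key]
    value = db.get(key)                  # 2. on a miss, read from the DB
    if value is not None:
        cache[key] = value               # 3. populate the cache for next time
    return value

def update_user(key, value):
    db[key] = value                      # write to the source of truth...
    cache.pop(key, None)                 # ...and invalidate the stale entry
```

Invalidating on write (rather than updating the cache in place) avoids one class of race condition, but stale reads are still possible between the DB write and the eviction — exactly the invalidation difficulty the answer mentions.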
5. What is the CAP theorem?
CAP theorem states that a distributed system can guarantee at most two of three properties: Consistency (every read returns the most recent write), Availability (every request gets a response, even if not the latest data), and Partition Tolerance (system continues despite network partitions). Since partitions are inevitable in distributed systems, you choose between CP (consistent but may be unavailable during partitions — e.g., HBase) and AP (always available but may return stale data — e.g., DynamoDB, Cassandra).
6. What is the difference between SQL and NoSQL databases and when do you choose each?
SQL (PostgreSQL, MySQL) enforces a schema, supports complex joins and ACID transactions — ideal when data relationships are complex and consistency is critical (banking, e-commerce orders). NoSQL: Document stores (MongoDB) for flexible schemas; Key-value (Redis) for caching and sessions; Wide-column (Cassandra) for write-heavy time-series at massive scale; Graph (Neo4j) for relationship-heavy queries. Most large systems use both (polyglot persistence).
7. What is database replication and sharding?
Replication copies data to multiple servers: a primary handles writes; read replicas handle reads (eventual consistency). This improves read throughput and availability. Sharding (horizontal partitioning) splits data across multiple databases based on a shard key (user_id % n). Sharding improves write throughput and storage but complicates cross-shard queries and transactions. Replication is for availability; sharding is for scale.
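The modulo shard-key routing above is a one-liner (shard naming here is a hypothetical convention for illustration):

```python
def shard_for(user_id: int, num_shards: int = 4) -> str:
    """Route a user to a shard via user_id % n (modulo sharding)."""
    return f"db_shard_{user_id % num_shards}"
```

Note the trade-off: changing `num_shards` remaps almost every key, which is why resharding a modulo scheme is expensive and why consistent hashing (from question 1) is often preferred when the shard count must grow.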
8. What is a message queue and when would you use one?
A message queue (Kafka, RabbitMQ, SQS) decouples producers from consumers, enabling async processing. Use cases: smooth traffic spikes by buffering requests, background jobs (email sending, report generation), event-driven microservices communication, guaranteed delivery with retries. Kafka specifically is used for high-throughput event streaming and maintaining an ordered log of events.
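The producer/consumer decoupling can be demonstrated in-process with Python's standard `queue` module standing in for Kafka or SQS (the job format and "send email" work are invented for the sketch):

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the broker

def producer():
    """Enqueue background jobs without waiting for them to run."""
    for user in ("a@example.com", "b@example.com"):
        jobs.put({"type": "send_email", "to": user})

def consumer(results):
    """Drain jobs until a None sentinel signals shutdown."""
    while True:
        job = jobs.get()
        if job is None:
            break
        results.append(f"emailed {job['to']}")  # do the work
        jobs.task_done()

results = []
worker = threading.Thread(target=consumer, args=(results,))
worker.start()
producer()
jobs.put(None)          # sentinel: no more jobs
worker.join()
```

The producer returns immediately after enqueuing — it never blocks on email delivery — which is the decoupling and spike-smoothing property the answer describes.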
9. What is a CDN and how does it work?
A CDN (Content Delivery Network) is a globally distributed network of edge servers that cache static assets (images, JS, CSS) close to users, reducing latency. When a user requests a file, the CDN serves it from the nearest edge node. On a cache miss, the edge fetches from the origin server and caches the response. CDNs also absorb DDoS traffic and offload origin server bandwidth.
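The edge-node miss/fill behavior can be sketched as a TTL cache in front of an origin store (both stores and the 60-second TTL are illustrative assumptions):

```python
import time

ORIGIN = {"/logo.png": b"<png bytes>"}   # stands in for the origin server
edge_cache = {}                          # path -> (body, expires_at)
TTL = 60.0                               # seconds an edge copy stays fresh

def edge_get(path, now=None):
    """Serve from the edge cache if fresh; otherwise fetch and cache."""
    now = time.time() if now is None else now
    entry = edge_cache.get(path)
    if entry is not None and entry[1] > now:
        return entry[0]                  # cache hit: serve from the edge
    body = ORIGIN.get(path)              # cache miss: go to the origin
    if body is not None:
        edge_cache[path] = (body, now + TTL)
    return body
```

Real CDNs derive the TTL from the origin's `Cache-Control` headers rather than a fixed constant, but the hit/miss/fill flow is the same.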
10. What is microservices architecture vs monolith?
A monolith is a single deployable unit containing all functionality — simpler to develop, test, and debug but hard to scale and deploy individual parts. Microservices decompose the system into small, independently deployable services each owning its data. Benefits: independent scaling, independent deployment, technology flexibility. Challenges: distributed system complexity (network calls, distributed transactions, service discovery, observability).
See all 16 System Design questions →
Level up your prep
Get company-specific System Design questions
Upload your resume → get questions tailored to Google, Amazon, TCS, and 50+ companies.
Try AI Interview Prep →
© 2025 CareerLens · Home · Interview Questions · Pricing