Free Study Guide · 2025

Top 40 System Design Interview Questions & Answers (2025)

System design interviews are often the deciding round for SDE-2 and above roles at product companies. These questions on load balancing, caching, databases, and microservices are asked at companies such as Flipkart, Amazon, Google, Razorpay, and CRED for senior engineering positions.

1. How would you design a URL shortener like bit.ly?
Key components: URL generation (base62-encode a unique ID, or hash the URL), storage (a key-value store such as Redis for hot URLs plus MySQL for persistence), redirection (HTTP 301 for permanent redirects, which browsers may cache; 302 when every click should reach the server for tracking), analytics (count clicks asynchronously via a message queue). Scale: use consistent hashing to distribute ID generation across nodes, a CDN for edge redirects, and DB read replicas. If you hash the URL, handle collisions with a random suffix or a retry.
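The base62 step can be sketched as follows. This is a minimal illustration, assuming the short code is derived from a non-negative integer ID (for example, an auto-increment key). The alphabet order and the function names are assumptions, not part of any particular service.

```python
import string

# Assumed alphabet order: digits, lowercase, uppercase (any fixed order works).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short base62 code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the integer ID from a base62 code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(encode_base62(125))   # "21", because 125 = 2 * 62 + 1
print(decode_base62("21"))  # 125
```

Seven base62 characters cover 62^7 (about 3.5 trillion) IDs, which is why short codes of that length are a common choice.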
2. What is the difference between horizontal and vertical scaling?
Vertical scaling (scaling up) means adding more resources to a single server (more CPU, RAM). It's simple but has limits and is a single point of failure. Horizontal scaling (scaling out) means adding more servers to distribute load. It requires a load balancer, stateless services (or shared session storage), and distributed data. Most modern architectures prefer horizontal scaling for high availability and cost efficiency.
3. What is a load balancer and what algorithms does it use?
A load balancer distributes incoming traffic across multiple backend servers for high availability and scalability. Algorithms: Round Robin (cyclic distribution), Weighted Round Robin (based on server capacity), Least Connections (route to server with fewest active connections), IP Hash (consistent routing for a client to the same server — useful for session affinity), and Random. L4 (TCP) vs L7 (HTTP-aware) load balancers differ in what they can inspect.
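Two of these algorithms can be sketched in a few lines. This is an in-memory illustration only; the class names are hypothetical, and a real load balancer also needs health checks and thread safety.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed cyclic order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


rr = RoundRobinBalancer(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']
```

Least Connections suits workloads where request durations vary widely, since Round Robin ignores how busy each server currently is.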
4. What is caching and what are common caching strategies?
Caching stores frequently accessed data in fast storage (Redis, Memcached) to reduce database load and latency. Strategies: Cache-aside (the app checks the cache, falls back to the DB on a miss, then populates the cache; the most common pattern), Write-through (write to cache and DB synchronously), Write-behind (write to cache, then asynchronously to DB; faster writes, but data can be lost if the cache fails before flushing), Read-through (the cache sits in front of the DB and loads missing entries itself). Cache invalidation (deciding when to evict or refresh stale data) is usually the hardest part.
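The cache-aside flow can be sketched as below. A plain dict stands in for Redis or Memcached, and `load_from_db` is a hypothetical callable supplied by the application; the TTL value is an arbitrary assumption.

```python
import time


class CacheAside:
    """Cache-aside: check the cache, fall back to the DB, then populate the cache."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # hypothetical DB lookup function
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._cache = {}               # key -> (value, expires_at)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # cache hit
        value = self._load(key)        # cache miss or expired: go to the DB
        self._cache[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        # Call after a write to the DB so the next read reloads fresh data.
        self._cache.pop(key, None)
```

The TTL bounds how long stale data can be served if an invalidation is missed, which is one common way to soften the invalidation problem.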
5. What is the CAP theorem?
CAP theorem states that a distributed system can guarantee at most two of three properties: Consistency (every read returns the most recent write), Availability (every request gets a response, even if not the latest data), and Partition Tolerance (system continues despite network partitions). Since partitions are inevitable in distributed systems, you choose between CP (consistent but may be unavailable during partitions — e.g., HBase) and AP (always available but may return stale data — e.g., DynamoDB, Cassandra).
6. What is the difference between SQL and NoSQL databases and when do you choose each?
SQL (PostgreSQL, MySQL) enforces a schema, supports complex joins and ACID transactions — ideal when data relationships are complex and consistency is critical (banking, e-commerce orders). NoSQL: Document stores (MongoDB) for flexible schemas; Key-value (Redis) for caching and sessions; Wide-column (Cassandra) for write-heavy time-series at massive scale; Graph (Neo4j) for relationship-heavy queries. Most large systems use both (polyglot persistence).
7. What is database replication and sharding?
Replication copies data to multiple servers: a primary handles writes, and read replicas serve reads. With asynchronous replication, replicas can lag behind the primary, so those reads are eventually consistent. This improves read throughput and availability. Sharding (horizontal partitioning) splits data across multiple databases based on a shard key (for example, user_id % n). Sharding improves write throughput and storage capacity but complicates cross-shard queries and transactions. In short, replication is mainly for availability and read scaling; sharding is for write and storage scale.
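The modulo shard key can be illustrated with a short sketch. The helper names are hypothetical. The second function measures how many keys change shard when the shard count changes, which is a practical cost of modulo-based routing worth raising in an interview.

```python
def shard_for(user_id: int, num_shards: int) -> int:
    """Pick a shard with the simple modulo rule: user_id % n."""
    return user_id % num_shards


def moved_fraction(user_ids, old_n: int, new_n: int) -> float:
    """Fraction of users whose shard changes when going from old_n to new_n shards."""
    user_ids = list(user_ids)
    moved = sum(
        1 for u in user_ids if shard_for(u, old_n) != shard_for(u, new_n)
    )
    return moved / len(user_ids)


print(shard_for(10, 4))                    # 2
print(moved_fraction(range(1000), 4, 5))   # 0.8
```

Growing from 4 to 5 shards relocates 80% of these users, because an ID keeps its shard only when its remainders modulo 4 and modulo 5 happen to match.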
8. What is a message queue and when would you use one?
A message queue (Kafka, RabbitMQ, SQS) decouples producers from consumers, enabling async processing. Use cases: smooth traffic spikes by buffering requests, background jobs (email sending, report generation), event-driven microservices communication, guaranteed delivery with retries. Kafka specifically is used for high-throughput event streaming and maintaining an ordered log of events.
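The producer/consumer decoupling can be sketched in-process with Python's standard `queue` module. This is only a stand-in for a real broker such as Kafka, RabbitMQ, or SQS: it has no persistence, retries, or network hop, and the job contents are made up for illustration.

```python
import queue
import threading


def run_worker(jobs: queue.Queue, results: list) -> None:
    """Consumer: drains jobs until it sees the None sentinel."""
    while True:
        job = jobs.get()
        if job is None:                 # sentinel: no more work
            jobs.task_done()
            break
        results.append(f"sent email to {job}")  # pretend background work
        jobs.task_done()


jobs = queue.Queue()
results = []
worker = threading.Thread(target=run_worker, args=(jobs, results))
worker.start()

# Producer: enqueue and move on without waiting for the work to finish.
for user in ["ana", "raj"]:
    jobs.put(user)
jobs.put(None)

jobs.join()     # block until every queued item has been processed
worker.join()
print(results)  # ['sent email to ana', 'sent email to raj']
```

The producer only pays the cost of `put`, so a burst of requests fills the queue instead of overloading the worker.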
9. What is a CDN and how does it work?
A CDN (Content Delivery Network) is a globally distributed network of edge servers that cache static assets (images, JS, CSS) close to users, reducing latency. When a user requests a file, the CDN serves it from the nearest edge node. On a cache miss, the edge fetches from the origin server and caches the response. CDNs also absorb DDoS traffic and offload origin server bandwidth.
10. What is microservices architecture vs monolith?
A monolith is a single deployable unit containing all functionality — simpler to develop, test, and debug but hard to scale and deploy individual parts. Microservices decompose the system into small, independently deployable services each owning its data. Benefits: independent scaling, independent deployment, technology flexibility. Challenges: distributed system complexity (network calls, distributed transactions, service discovery, observability).
11. How do microservices communicate?
Synchronous: REST over HTTP (simple, widely understood) or gRPC (binary, typically faster, with typed contracts via Protocol Buffers; often preferred for internal service-to-service calls). Asynchronous: message brokers (Kafka, RabbitMQ) for fire-and-forget or event-driven patterns. Async decouples services and improves resilience. Use sync for low-latency client-facing requests; use async for background processing, high-throughput workloads, and cases where strong decoupling matters.
12. What is a rate limiter and how would you design one?
A rate limiter restricts the number of requests a client can make in a time window. Algorithms: Fixed Window (simple but allows bursts at window boundaries), Sliding Window Log (accurate but memory-intensive), Token Bucket (tokens accumulate at rate r, each request consumes one — allows short bursts), Leaky Bucket (requests queue and drain at a fixed rate). For distributed systems, store state in Redis with atomic Lua scripts or the INCR+EXPIRE pattern.
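The Token Bucket algorithm can be sketched as a single-process class. This is a minimal illustration, not a distributed limiter: state lives in memory rather than Redis, and the class and parameter names are assumptions.

```python
import time


class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request spends one."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so tests can control time
        self._tokens = float(capacity)   # start full, which permits an initial burst
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Add the tokens earned since the last call, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

In practice you would keep one bucket per client key (user ID, API key, or IP). `capacity` controls the burst size and `rate` the sustained throughput.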
13. What is consistent hashing?
Consistent hashing maps both data and nodes onto a virtual ring (positions 0 to 2^32 - 1). A key is assigned to the first node found moving clockwise from the key's position. When a node is added or removed, only the keys in that node's segment are redistributed (about 1/N of the keys on average, for N nodes), unlike naive modulo hashing, where almost all keys move. It is used in distributed caches and databases (Cassandra, DynamoDB) to minimise data movement during scaling.
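A small ring can be sketched as follows. Two details are assumptions that go beyond the answer above: positions come from the first 4 bytes of an MD5 digest, and each node is placed at several "virtual node" positions to even out the load. The class name is hypothetical.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on a 2^32 ring (first 4 bytes of MD5)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self._vnodes = vnodes     # virtual nodes per physical node (assumption)
        self._positions = []      # sorted ring positions
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue          # rare position collision: keep the existing owner
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # First node position at or after the key, wrapping around the ring.
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

After `ring.add("d")`, every key that changes owner moves to "d" and all other keys stay where they were, which is the property that limits data movement.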
14. How would you design a notification system?
Components: notification service that receives events, a fan-out service to determine recipients, channel handlers (push, email, SMS, in-app), a message queue between them for async delivery, a user preferences service to filter channels, and a dedupe layer. At scale, Kafka fans out events to per-channel workers. Store notification history in a DB for in-app retrieval. Handle retries with exponential backoff and dead-letter queues.
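The retry behaviour can be sketched as below. This assumes a hypothetical `send` callable that raises on failure, and uses a plain list as the dead-letter queue. Jitter, which production systems usually add to the delay, is left out to keep the sketch short.

```python
import time


def deliver_with_retry(send, message, dead_letters, max_attempts=4,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Try to send; on failure wait base_delay * 2^attempt, then retry.

    After max_attempts failures the message goes to the dead-letter list.
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:
            if attempt == max_attempts - 1:
                break                       # out of attempts, stop retrying
            sleep(min(max_delay, base_delay * 2 ** attempt))
    dead_letters.append(message)            # park it for inspection or replay
    return False
```

With the defaults, the waits between attempts are 0.5 s, 1 s, and 2 s. Passing `sleep=delays.append` in a test records those delays without actually waiting.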
15. What is an API gateway?
An API gateway is a single entry point for all client requests to backend services. It handles cross-cutting concerns: authentication/authorisation, rate limiting, SSL termination, request routing, protocol translation, response caching, and logging. Examples: AWS API Gateway, Kong, NGINX. It simplifies clients (one endpoint instead of many) and moves infrastructure concerns out of individual services.
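Two of these concerns, authentication and request routing, can be sketched together. The route table, service names, and token check are all hypothetical stand-ins; real gateways such as Kong or NGINX are configured rather than hand-written like this.

```python
# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/users": "user-service",
    "/orders": "order-service",
}


def route(path: str, token: str, valid_tokens: set):
    """Return (status, backend): authenticate first, then match the longest prefix."""
    if token not in valid_tokens:
        return 401, None                    # reject before touching any backend
    for prefix in sorted(ROUTES, key=len, reverse=True):
        # Match the prefix exactly or as a full path segment.
        if path == prefix or path.startswith(prefix + "/"):
            return 200, ROUTES[prefix]
    return 404, None


print(route("/users/42", "secret", {"secret"}))   # (200, 'user-service')
print(route("/users/42", "wrong", {"secret"}))    # (401, None)
```

Because the check happens once at the edge, the backend services in this sketch never see unauthenticated traffic.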
16. What is eventual consistency vs strong consistency?
Strong consistency guarantees that after a write completes, all subsequent reads return the updated value. This requires coordination across replicas, which adds latency and can reduce availability during failures. Eventual consistency guarantees that, given enough time without new updates, all replicas converge to the same value. It offers higher availability, but reads may return stale data. Systems like DynamoDB and Cassandra offer tunable consistency levels between the two extremes.
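One common way to reason about tunable consistency in quorum-replicated stores is the overlap rule: with N replicas, W write acknowledgements, and R replicas consulted per read, every read set intersects the latest acknowledged write set when R + W > N. The sketch below only checks that inequality. The function name is hypothetical, and real systems add caveats (failed writes, conflict resolution) that it does not model.

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """True when any R replicas must include one of the W that took the last write."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


print(quorum_overlaps(3, 2, 2))  # True: quorum writes plus quorum reads
print(quorum_overlaps(3, 1, 1))  # False: fast, but reads may be stale
```

Lowering W or R trades consistency for latency and availability, which is the dial these tunable systems expose.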
© 2025 CareerLens