HomeInterview QuestionsSQL Interview QuestionsTop 20
🗄️
Free Study Guide · 2025

Top 20 SQLInterview Questions & Answers (2025)

SQL is tested in every backend, data engineering, data analyst, and SDE interview. Whether you're joining TCS as a fresher or interviewing for a data role at Flipkart or Swiggy, these are the queries and concepts you will be asked.

20 questions
Detailed answers
100% free
Also available:Top 10All 20 questions →
1What is the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN?
INNER JOIN returns rows that have matching values in both tables. LEFT JOIN returns all rows from the left table and matched rows from the right (NULL for no match). RIGHT JOIN is the mirror — all rows from the right. FULL OUTER JOIN returns all rows from both, with NULLs where there's no match. LEFT JOIN is the most commonly used; INNER JOIN is the default.
2What is the difference between WHERE and HAVING?
WHERE filters rows before grouping (before GROUP BY is applied) and cannot use aggregate functions. HAVING filters groups after GROUP BY and can use aggregate functions (HAVING COUNT(*) > 5). Rule of thumb: WHERE filters individual rows; HAVING filters aggregated groups. You can use both in the same query.
3What are indexes and how do they improve query performance?
An index is a data structure (typically a B-tree) that allows the database to find rows without scanning the entire table. Like a book index, it maps column values to row locations. Indexes dramatically speed up SELECT queries with WHERE, JOIN, and ORDER BY on indexed columns but slow down INSERT/UPDATE/DELETE (the index must be maintained). Create indexes on columns used in WHERE and JOIN conditions.
4What is database normalisation? Explain 1NF, 2NF, 3NF.
Normalisation reduces data redundancy and improves integrity. 1NF: each column has atomic (indivisible) values, no repeating groups. 2NF: 1NF + every non-key attribute is fully functionally dependent on the entire primary key (no partial dependency — matters for composite keys). 3NF: 2NF + no transitive dependencies (non-key attribute doesn't depend on another non-key attribute).
5What is a PRIMARY KEY vs UNIQUE KEY?
A PRIMARY KEY uniquely identifies each row, cannot be NULL, and there can only be one per table. A UNIQUE KEY also enforces uniqueness but allows one NULL and there can be multiple per table. Both create an index automatically. Use PRIMARY KEY for the entity identifier; use UNIQUE constraints for other columns that must be unique (email, username).
6What is a transaction? What are ACID properties?
A transaction is a sequence of operations executed as a single unit. ACID properties ensure reliability: Atomicity (all operations succeed or all are rolled back), Consistency (the database moves from one valid state to another), Isolation (concurrent transactions appear sequential), Durability (committed data survives system failures). Transactions are critical for financial operations.
7What is the difference between DELETE, TRUNCATE, and DROP?
DELETE removes specific rows based on a WHERE clause — it's DML, logged, and can be rolled back. TRUNCATE removes all rows from a table instantly — it's DDL, minimally logged, cannot be used with WHERE, and in many databases cannot be rolled back within a transaction. DROP removes the entire table structure and data permanently.
8What are window functions in SQL?
Window functions perform calculations across a set of rows related to the current row (a 'window') without collapsing them into groups like GROUP BY. Common functions: ROW_NUMBER(), RANK(), DENSE_RANK() for ranking; LAG(), LEAD() for accessing previous/next rows; SUM(), AVG() with OVER (PARTITION BY col ORDER BY col). Essential for analytics queries.
9What is a subquery vs a CTE?
A subquery is a query nested inside another query (in WHERE, FROM, or SELECT clause). A CTE (Common Table Expression) uses the WITH keyword to define a named, temporary result set you can reference multiple times in the main query. CTEs are more readable for complex queries, can be recursive, and some databases optimise them as materialised results.
10What is the difference between UNION and UNION ALL?
UNION combines result sets of two queries and removes duplicate rows (requires sorting/hashing). UNION ALL combines result sets and keeps all duplicates — it's faster because no deduplication is needed. Use UNION when duplicates are a concern; use UNION ALL for performance when you know results are distinct or duplicates are acceptable.
11How do you find duplicate records in a table?
Use GROUP BY and HAVING: SELECT email, COUNT(*) as cnt FROM users GROUP BY email HAVING COUNT(*) > 1. To see the full duplicate rows, join this back to the original table. To delete duplicates keeping only one, use a CTE with ROW_NUMBER() PARTITION BY the duplicate-identifying columns and delete rows where row_number > 1.
12What is a stored procedure vs a function?
A stored procedure is a precompiled block of SQL that can perform actions, has no mandatory return value, and can use OUT parameters and transactions. A function must return a value, can be used in SELECT statements and expressions, and cannot have side effects (in most databases). Use stored procedures for multi-step operations; functions for reusable computations in queries.
13What is query optimization? How do you tune a slow query?
Steps: (1) Use EXPLAIN/EXPLAIN ANALYZE to view the query plan and find full table scans. (2) Add indexes on WHERE and JOIN columns. (3) Avoid SELECT * — select only needed columns. (4) Avoid functions on indexed columns in WHERE (WHERE YEAR(created_at) = 2024 prevents index use; use WHERE created_at BETWEEN ... instead). (5) Check for N+1 query patterns. (6) Consider denormalisation or caching for frequently-run expensive reads.
14What are the different types of SQL commands (DDL, DML, DCL, TCL)?
DDL (Data Definition Language): CREATE, ALTER, DROP, TRUNCATE — defines schema structure. DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE — manipulates data. DCL (Data Control Language): GRANT, REVOKE — manages permissions. TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT — manages transactions.
15What is the N+1 query problem?
The N+1 problem occurs when you fetch N records and then execute one additional query per record to get related data — resulting in N+1 total queries. Example: fetching 100 users then querying each user's orders separately = 101 queries. Fix by using JOIN or a batch query (SELECT * FROM orders WHERE user_id IN (1,2,...,100)) to retrieve all related data in one or two queries.
16What is database sharding?
Sharding is horizontal partitioning of a database across multiple servers (shards), each holding a subset of the data. Shard key determines which shard a record belongs to (e.g., user_id % num_shards). Benefits: scales writes and storage beyond a single server. Challenges: cross-shard queries are expensive, resharding is hard, and transactions across shards are complex.
17What is a foreign key and what does ON DELETE CASCADE do?
A foreign key is a column (or group of columns) in a table that references the primary key of another table, enforcing referential integrity. ON DELETE CASCADE means when the referenced parent row is deleted, all child rows with that foreign key are automatically deleted too. Other options: SET NULL (set foreign key to NULL), RESTRICT (prevent deletion if children exist).
18Write a SQL query to find the second highest salary.
Method 1 (subquery): SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees). Method 2 (LIMIT/OFFSET): SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1. Method 3 (window function): SELECT salary FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees) t WHERE rnk = 2.
19What is the difference between a clustered and a non-clustered index?
A clustered index determines the physical order of rows in the table — the table's data is stored in the order of the clustered index. There can be only one per table (usually the primary key). A non-clustered index is a separate structure that stores the indexed columns with pointers back to the actual row. A table can have many non-clustered indexes. Clustered index lookups are faster for range scans.
20What is an execution plan and why is it useful?
An execution plan (shown by EXPLAIN in PostgreSQL/MySQL) describes the steps the query optimizer chose to execute a SQL statement — join methods (hash join, nested loop), index usage, estimated row counts, and cost. Reading it helps you understand why a query is slow: full table scans (Seq Scan), missing indexes, bad row estimates, or inefficient join order. It's the first tool to reach for when tuning queries.
Level up your prep
Get company-specific SQL questions
Upload your resume → get questions tailored to Google, Amazon, TCS, and 50+ companies.
Try AI Interview Prep →
© 2025 CareerLens · Home · Interview Questions · Pricing