Question 1

What is the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN?

Accepted Answer

INNER JOIN returns rows that have matching values in both tables. LEFT JOIN returns all rows from the left table and matched rows from the right (NULL for no match). RIGHT JOIN is the mirror — all rows from the right. FULL OUTER JOIN returns all rows from both, with NULLs where there's no match. LEFT JOIN is the most commonly used; INNER JOIN is the default.

Question 2

What is the difference between WHERE and HAVING?

Accepted Answer

WHERE filters rows before grouping (before GROUP BY is applied) and cannot use aggregate functions. HAVING filters groups after GROUP BY and can use aggregate functions (HAVING COUNT(*) > 5). Rule of thumb: WHERE filters individual rows; HAVING filters aggregated groups. You can use both in the same query.

Question 3

What are indexes and how do they improve query performance?

Accepted Answer

An index is a data structure (typically a B-tree) that allows the database to find rows without scanning the entire table. Like a book index, it maps column values to row locations. Indexes dramatically speed up SELECT queries with WHERE, JOIN, and ORDER BY on indexed columns but slow down INSERT/UPDATE/DELETE (the index must be maintained). Create indexes on columns used in WHERE and JOIN conditions.

Question 4

What is database normalisation? Explain 1NF, 2NF, 3NF.

Accepted Answer

Normalisation reduces data redundancy and improves integrity. 1NF: each column has atomic (indivisible) values, no repeating groups. 2NF: 1NF + every non-key attribute is fully functionally dependent on the entire primary key (no partial dependency — matters for composite keys). 3NF: 2NF + no transitive dependencies (non-key attribute doesn't depend on another non-key attribute).

Question 5

What is a PRIMARY KEY vs UNIQUE KEY?

Accepted Answer

A PRIMARY KEY uniquely identifies each row, cannot be NULL, and there can only be one per table. A UNIQUE KEY also enforces uniqueness but allows one NULL and there can be multiple per table. Both create an index automatically. Use PRIMARY KEY for the entity identifier; use UNIQUE constraints for other columns that must be unique (email, username).

Question 6

What is a transaction? What are ACID properties?

Accepted Answer

A transaction is a sequence of operations executed as a single unit. ACID properties ensure reliability: Atomicity (all operations succeed or all are rolled back), Consistency (the database moves from one valid state to another), Isolation (concurrent transactions appear sequential), Durability (committed data survives system failures). Transactions are critical for financial operations.

Question 7

What is the difference between DELETE, TRUNCATE, and DROP?

Accepted Answer

DELETE removes specific rows based on a WHERE clause — it's DML, logged, and can be rolled back. TRUNCATE removes all rows from a table instantly — it's DDL, minimally logged, cannot be used with WHERE, and in many databases cannot be rolled back within a transaction. DROP removes the entire table structure and data permanently.

Question 8

What are window functions in SQL?

Accepted Answer

Window functions perform calculations across a set of rows related to the current row (a 'window') without collapsing them into groups like GROUP BY. Common functions: ROW_NUMBER(), RANK(), DENSE_RANK() for ranking; LAG(), LEAD() for accessing previous/next rows; SUM(), AVG() with OVER (PARTITION BY col ORDER BY col). Essential for analytics queries.

Question 9

What is a subquery vs a CTE?

Accepted Answer

A subquery is a query nested inside another query (in WHERE, FROM, or SELECT clause). A CTE (Common Table Expression) uses the WITH keyword to define a named, temporary result set you can reference multiple times in the main query. CTEs are more readable for complex queries, can be recursive, and some databases optimise them as materialised results.

Question 10

What is the difference between UNION and UNION ALL?

Accepted Answer

UNION combines result sets of two queries and removes duplicate rows (requires sorting/hashing). UNION ALL combines result sets and keeps all duplicates — it's faster because no deduplication is needed. Use UNION when duplicates are a concern; use UNION ALL for performance when you know results are distinct or duplicates are acceptable.

Question 11

How do you find duplicate records in a table?

Accepted Answer

Use GROUP BY and HAVING: SELECT email, COUNT(*) as cnt FROM users GROUP BY email HAVING COUNT(*) > 1. To see the full duplicate rows, join this back to the original table. To delete duplicates keeping only one, use a CTE with ROW_NUMBER() PARTITION BY the duplicate-identifying columns and delete rows where row_number > 1.

Question 12

What is a stored procedure vs a function?

Accepted Answer

A stored procedure is a precompiled block of SQL that can perform actions, has no mandatory return value, and can use OUT parameters and transactions. A function must return a value, can be used in SELECT statements and expressions, and cannot have side effects (in most databases). Use stored procedures for multi-step operations; functions for reusable computations in queries.

Question 13

What is query optimization? How do you tune a slow query?

Accepted Answer

Steps: (1) Use EXPLAIN/EXPLAIN ANALYZE to view the query plan and find full table scans. (2) Add indexes on WHERE and JOIN columns. (3) Avoid SELECT * — select only needed columns. (4) Avoid functions on indexed columns in WHERE (WHERE YEAR(created_at) = 2024 prevents index use; use WHERE created_at BETWEEN ... instead). (5) Check for N+1 query patterns. (6) Consider denormalisation or caching for frequently-run expensive reads.

Question 14

What are the different types of SQL commands (DDL, DML, DCL, TCL)?

Accepted Answer

DDL (Data Definition Language): CREATE, ALTER, DROP, TRUNCATE — defines schema structure. DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE — manipulates data. DCL (Data Control Language): GRANT, REVOKE — manages permissions. TCL (Transaction Control Language): COMMIT, ROLLBACK, SAVEPOINT — manages transactions.

Question 15

What is the N+1 query problem?

Accepted Answer

The N+1 problem occurs when you fetch N records and then execute one additional query per record to get related data — resulting in N+1 total queries. Example: fetching 100 users then querying each user's orders separately = 101 queries. Fix by using JOIN or a batch query (SELECT * FROM orders WHERE user_id IN (1,2,...,100)) to retrieve all related data in one or two queries.

Question 16

What is database sharding?

Accepted Answer

Sharding is horizontal partitioning of a database across multiple servers (shards), each holding a subset of the data. Shard key determines which shard a record belongs to (e.g., user_id % num_shards). Benefits: scales writes and storage beyond a single server. Challenges: cross-shard queries are expensive, resharding is hard, and transactions across shards are complex.

Question 17

What is a foreign key and what does ON DELETE CASCADE do?

Accepted Answer

A foreign key is a column (or group of columns) in a table that references the primary key of another table, enforcing referential integrity. ON DELETE CASCADE means when the referenced parent row is deleted, all child rows with that foreign key are automatically deleted too. Other options: SET NULL (set foreign key to NULL), RESTRICT (prevent deletion if children exist).

Question 18

Write a SQL query to find the second highest salary.

Accepted Answer

Method 1 (subquery): SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees). Method 2 (LIMIT/OFFSET): SELECT DISTINCT salary FROM employees ORDER BY salary DESC LIMIT 1 OFFSET 1. Method 3 (window function): SELECT salary FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM employees) t WHERE rnk = 2.

Question 19

What is the difference between a clustered and a non-clustered index?

Accepted Answer

A clustered index determines the physical order of rows in the table — the table's data is stored in the order of the clustered index. There can be only one per table (usually the primary key). A non-clustered index is a separate structure that stores the indexed columns with pointers back to the actual row. A table can have many non-clustered indexes. Clustered index lookups are faster for range scans.

Question 20

What is an execution plan and why is it useful?

Accepted Answer

An execution plan (shown by EXPLAIN in PostgreSQL/MySQL) describes the steps the query optimizer chose to execute a SQL statement — join methods (hash join, nested loop), index usage, estimated row counts, and cost. Reading it helps you understand why a query is slow: full table scans (Seq Scan), missing indexes, bad row estimates, or inefficient join order. It's the first tool to reach for when tuning queries.

Top 20 SQLInterview Questions & Answers (2025)