Question 1

What is a DBMS? How is it different from a file system?

Accepted Answer

A Database Management System (DBMS) is software that manages, stores, and retrieves structured data efficiently. Unlike a file system, a DBMS provides: data integrity enforcement, concurrent access control, query language (SQL), role-based security, crash recovery, and elimination of data redundancy. File systems store raw files with no built-in relationships, constraints, or query capability — you'd have to write custom code for everything a DBMS gives you out of the box.

Question 2

What is the difference between DBMS and RDBMS?

Accepted Answer

A DBMS stores data without enforcing relationships between tables (e.g., file-based or hierarchical databases like IMS). An RDBMS (Relational DBMS) organises data into tables with defined relationships enforced via primary keys, foreign keys, and referential integrity constraints. RDBMS follows E.F. Codd's 12 rules for relational databases. Examples: MySQL, PostgreSQL, Oracle, SQL Server are RDBMS; MongoDB (document store) is a NoSQL DBMS.

Question 3

What are ACID properties?

Accepted Answer

ACID ensures reliable database transactions: Atomicity — a transaction is all-or-nothing (either all operations succeed or none are committed); Consistency — a transaction brings the database from one valid state to another, never violating constraints; Isolation — concurrent transactions execute as if they were sequential (no dirty reads, phantom reads); Durability — once committed, data survives system crashes (written to disk/WAL). ACID is the foundation of transactional integrity in RDBMS.

Question 4

What is normalization? What are 1NF, 2NF, and 3NF?

Accepted Answer

Normalization eliminates redundancy and ensures data integrity by organising tables according to normal forms. 1NF: Each column must contain atomic (indivisible) values and each row must be unique — no repeating groups or arrays. 2NF: Must be in 1NF + every non-key attribute must depend on the entire primary key (no partial dependencies — only matters when the key is composite). 3NF: Must be in 2NF + no transitive dependencies (non-key columns must not depend on other non-key columns). BCNF (Boyce-Codd): stricter 3NF — every determinant must be a candidate key.

Question 5

What is denormalization and when would you use it?

Accepted Answer

Denormalization intentionally introduces redundancy into a normalized schema to improve read performance. By storing pre-joined data, you eliminate costly JOIN operations at query time. Use it in read-heavy analytics systems, reporting tables, or data warehouses where writes are infrequent but queries are complex. Example: storing customer_name in an Orders table even though it exists in Customers — saves a JOIN on every order query. The tradeoff: updates become harder (must update in multiple places).

Question 6

What is a primary key, foreign key, and unique key?

Accepted Answer

Primary Key (PK): uniquely identifies each row in a table, cannot be NULL, one per table. Foreign Key (FK): a column in one table that references the PK of another table — enforces referential integrity (you can't insert an order for a customer_id that doesn't exist). Unique Key: enforces uniqueness like PK but allows one NULL value and a table can have multiple unique keys. Example: email column in Users table can be a unique key.

Question 7

What is a transaction? What are its states?

Accepted Answer

A transaction is a logical unit of work consisting of one or more SQL operations that must all succeed or all fail together. Transaction states: Active (executing), Partially Committed (last statement executed, awaiting commit), Committed (all changes made permanent), Failed (error occurred, rollback needed), Aborted (rolled back, database restored to pre-transaction state). Use BEGIN TRANSACTION, COMMIT, ROLLBACK to control transactions in SQL.

Question 8

What are the different types of JOIN in SQL?

Accepted Answer

INNER JOIN: returns only rows where there's a match in both tables. LEFT JOIN: returns all rows from the left table plus matched rows from the right (NULLs for no match). RIGHT JOIN: all rows from the right + matched from left. FULL OUTER JOIN: all rows from both tables with NULLs where no match. CROSS JOIN: Cartesian product — every combination of rows. SELF JOIN: joins a table to itself (e.g., finding employees with the same manager). NATURAL JOIN: auto-joins on columns with the same name.

Question 9

What is indexing in DBMS? What are its types?

Accepted Answer

An index is a data structure (typically a B-tree or hash) that speeds up data retrieval at the cost of additional storage and slower writes. Types: Primary Index — on the primary key, data file is physically sorted by it. Secondary Index — on non-key attributes, data file is NOT sorted by it (also called non-clustered). Clustered Index — determines the physical order of data on disk; a table can have only one. Non-Clustered Index — a separate structure pointing to row locations; a table can have many. Dense Index — entry for every search key. Sparse Index — entry for only some search keys (in a sorted file).

Question 10

What is the difference between clustered and non-clustered index?

Accepted Answer

Clustered index determines the physical sort order of data in the table — the table IS the index. A table can have only one clustered index (usually the PK). Data retrieval is fastest for range queries on the clustered key. Non-clustered index is a separate B-tree structure containing the indexed column(s) plus a pointer (row locator) to the actual data row. A table can have many non-clustered indexes. Non-clustered is slower than clustered but more flexible.

Question 11

What is a deadlock in DBMS? How can it be prevented?

Accepted Answer

A deadlock occurs when two or more transactions are each waiting for the other to release a lock, creating a circular wait. Example: T1 holds lock on A and wants B; T2 holds lock on B and wants A — both wait forever. Prevention strategies: Lock Ordering (always acquire locks in the same order), Timeouts (abort a transaction that waits too long), Wait-Die / Wound-Wait schemes (priority-based preemption), or Deadlock Detection (let it happen, then abort one transaction using a cycle-detection algorithm).

Question 12

What are the different types of database locks?

Accepted Answer

Shared Lock (S): allows multiple transactions to read the same data simultaneously; no writes allowed while shared locks exist. Exclusive Lock (X): only one transaction can hold it; blocks all other reads and writes. Intent Locks: signal to higher-level locks that a lower-level lock will be acquired (Intent Shared IS, Intent Exclusive IX). Two-Phase Locking (2PL): growing phase (acquire locks), shrinking phase (release locks) — guarantees serializability but can deadlock.

Question 13

What is concurrency control? What are dirty read, non-repeatable read, and phantom read?

Accepted Answer

Concurrency control ensures correct execution of simultaneous transactions. Dirty Read: reading data written by an uncommitted transaction (which might be rolled back). Non-Repeatable Read: reading the same row twice within one transaction gets different values (another transaction updated/deleted it between reads). Phantom Read: re-running a query returns different rows (another transaction inserted/deleted rows matching the WHERE clause). SQL isolation levels — Read Uncommitted, Read Committed, Repeatable Read, Serializable — progressively prevent more of these anomalies.

Question 14

What is a view in SQL?

Accepted Answer

A view is a virtual table based on a SELECT query stored in the database. It doesn't store data itself — it queries the underlying tables on every access. Uses: simplify complex queries (define once, use by name), security (expose only certain columns to users), and abstraction (hide schema changes). Updatable views allow INSERT/UPDATE/DELETE if they map to a single base table with no aggregation. Materialized views (in PostgreSQL, Oracle) physically store the query result and can be refreshed periodically for performance.

Question 15

What is the difference between DELETE, TRUNCATE, and DROP?

Accepted Answer

DELETE: DML command, removes rows one by one with a WHERE clause, writes to transaction log (can be rolled back), triggers fire, slower. TRUNCATE: DDL command, removes ALL rows at once without WHERE, minimal logging (much faster), cannot be rolled back in most RDBMS (in PostgreSQL it can), triggers do NOT fire, resets identity columns. DROP: DDL command, deletes the entire table structure and all data permanently — cannot be rolled back. Think: DELETE removes content, TRUNCATE empties the container, DROP destroys the container.

Question 16

What is a stored procedure and how is it different from a function?

Accepted Answer

A stored procedure is a precompiled set of SQL statements stored in the database, executed with CALL or EXEC. It can perform DML, DDL, call other procedures, and return multiple result sets. A function must return a single value (scalar) or a table, can be used inside SELECT statements, and cannot perform DML (in most RDBMS). Key differences: functions are used in expressions, procedures are called standalone; functions cannot have side effects (ideally); procedures can have OUT parameters, transactions, and error handling.

Question 17

What is a trigger?

Accepted Answer

A trigger is a stored procedure that automatically executes in response to specific DML events (INSERT, UPDATE, DELETE) on a table. Types: BEFORE trigger (runs before the operation — useful for validation/transformation), AFTER trigger (runs after — useful for audit logging, cascading actions), INSTEAD OF trigger (replaces the operation — used on views). Example: an AFTER INSERT trigger on Orders could automatically update inventory in a Products table. Use triggers sparingly — they add hidden logic that's hard to debug.

Question 18

What is an ER diagram?

Accepted Answer

An Entity-Relationship (ER) diagram is a visual representation of data and the relationships between entities. Key components: Entities (rectangles) — real-world objects like Student, Course. Attributes (ellipses) — properties of entities (name, ID). Relationships (diamonds) — associations between entities (enrolled-in). Cardinality: one-to-one (1:1), one-to-many (1:N), many-to-many (M:N). ER diagrams are the first step in database design — used to plan the schema before writing SQL.

Question 19

What is the difference between SQL and NoSQL?

Accepted Answer

SQL (Relational): structured tables with fixed schema, ACID transactions, powerful JOIN/aggregation queries, scales vertically. Best for: financial systems, ERP, apps requiring complex queries and strong consistency. NoSQL: flexible/schema-less (document, key-value, column, graph stores), BASE consistency model, horizontal scaling, often eventual consistency. Best for: high-write workloads, unstructured data, real-time apps, social graphs. Examples: MongoDB (document), Redis (key-value), Cassandra (column), Neo4j (graph). Choose based on data structure, consistency needs, and scale.

Question 20

What is functional dependency?

Accepted Answer

A functional dependency X → Y means that the value of X uniquely determines the value of Y. Example: StudentID → StudentName means knowing the StudentID tells you the StudentName. Trivial dependency: Y is a subset of X (X → X is always true). Non-trivial: Y is not a subset of X. Functional dependencies drive normalization — 2NF eliminates partial dependencies (non-key → part of composite key); 3NF eliminates transitive dependencies (non-key → non-key → key).

Question 21

What is a checkpoint in DBMS?

Accepted Answer

A checkpoint is a mechanism where the DBMS writes all dirty (modified but uncommitted) pages from the buffer to disk and records the current state in the log. Purpose: speed up crash recovery — on restart, instead of replaying the entire log, the system only replays transactions after the last checkpoint. Types: Sharp checkpoint (halts all writes briefly), Fuzzy checkpoint (allows concurrent writes). Checkpoints are why databases can recover quickly from crashes without losing committed data.

Top 30 DBMS Interview Questions & Answers for Freshers (2025)