SQL Joins - Interview Questions and Answers

SQL Joins are used to combine data from multiple tables based on a related column. Joins help retrieve meaningful insights from relational databases.

For example, in an Indian retail database:

  • Customers table (CustomerID, Name, City)
  • Orders table (OrderID, CustomerID, Amount)

A JOIN can combine customer details with their order history.

  • INNER JOIN: Returns matching rows from both tables.
  • OUTER JOIN: Returns matching and non-matching rows, filling gaps with NULLs.
-- INNER JOIN (Only matching customers with orders)
SELECT Customers.Name, Orders.Amount
FROM Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Returns all rows from the left table, even if there's no match in the right table.

-- Show all customers, even those who haven?t placed orders
SELECT Customers.Name, Orders.Amount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Returns all rows from the right table, even if there's no match in the left table.

-- Show all orders, even if some customers are missing
SELECT Customers.Name, Orders.Amount
FROM Customers
RIGHT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Returns all records from both tables, filling gaps with NULLs.

SELECT Customers.Name, Orders.Amount
FROM Customers
FULL OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Produces a cartesian product, multiplying each row from table A with each row from table B.

SELECT Employees.Name, Departments.DepartmentName
FROM Employees
CROSS JOIN Departments;

Used in scenarios like generating price combinations or test data.

A SELF JOIN joins a table to itself.

Example: Find Indian employees and their managers from the same table.

SELECT E1.Name AS Employee, E2.Name AS Manager
FROM Employees E1
JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID;

  • JOIN merges columns from two tables.
  • UNION merges rows from two queries.
-- UNION Example (Combine lists of Mumbai & Delhi customers)
SELECT Name FROM Customers WHERE City = 'Mumbai'
UNION
SELECT Name FROM Customers WHERE City = 'Delhi';

It creates a CROSS JOIN, generating all possible combinations of rows.

Yes, you can join multiple tables.

SELECT Customers.Name, Orders.Amount, Payments.PaymentDate
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
JOIN Payments ON Orders.OrderID = Payments.OrderID;

  • LEFT JOIN  - All rows from the left table + matches from the right table.
  • RIGHT JOIN - All rows from the right table + matches from the left table.

  • CROSS JOIN  - Produces all possible combinations.
  • INNER JOIN  - Returns only matching rows.

It fills missing values with NULL where no match is found.

  • EQUI JOIN - Uses = operator (Example: ON A.ID = B.ID)
  • NON-EQUI JOIN - Uses <, >, <=, etc.
SELECT Employees.Name, Salaries.Amount
FROM Employees
JOIN Salaries ON Employees.Salary > Salaries.MinSalary;

  • ON - Filters data before joining.
  • WHERE - Filters data after joining.
SELECT * FROM Orders
LEFT JOIN Customers ON Orders.CustomerID = Customers.CustomerID
WHERE Customers.City = 'Delhi';

  • NATURAL JOIN automatically finds common columns.
  • INNER JOIN requires explicit ON conditions.
SELECT * FROM Customers NATURAL JOIN Orders;

Yes, it?s used for aggregations.

SELECT City, COUNT(Orders.OrderID)
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
GROUP BY City;

SELECT Customers.Name, Orders.Amount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Orders.Amount DESC;

Used for filtering aggregated data.

SELECT City, COUNT(OrderID) AS OrderCount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
GROUP BY City
HAVING COUNT(OrderID) > 10;

  • Indexes improve JOIN performance by making lookups faster.
  • Clustered Indexes store data in sorted order.
  • Non-clustered Indexes store pointers to data.
CREATE INDEX idx_CustomerID ON Orders(CustomerID);

A join using multiple columns.

SELECT * FROM Orders
JOIN Payments ON Orders.OrderID = Payments.OrderID
AND Orders.CustomerID = Payments.CustomerID;

Used for hierarchical data.

SELECT E1.Name AS Employee, E2.Name AS Manager
FROM Employees E1
JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID;

Using multiple joins can slow down performance if:

  • Indexes are missing on join columns.
  • Too many tables are joined at once.
  • Joins involve large datasets without filters.

Optimization Tip: Use EXPLAIN PLAN in SQL Server or MySQL to check query execution.

EXPLAIN SELECT * FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
JOIN Payments ON Orders.OrderID = Payments.OrderID;

  1. Use Indexes on join columns.
  2. Filter data early using WHERE or ON.
  3. Use proper join order (smaller datasets first).
  4. **Avoid SELECT *** (fetch only needed columns).
  5. Consider partitioning for huge tables.

Example: Using an index on CustomerID for faster lookups

CREATE INDEX idx_customerID ON Orders(CustomerID);

 

An ANTI JOIN finds rows in one table that do not have a match in another.

Example: Find customers who haven?t placed any orders

SELECT Customers.Name
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.CustomerID IS NULL;

 

Since FULL OUTER JOIN is not supported in MySQL, it can be replaced with UNION.

SELECT Customers.Name, Orders.Amount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID

UNION

SELECT Customers.Name, Orders.Amount
FROM Customers
RIGHT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

 

These are SQL Server execution plans:

  • Nested Loop Join: Best for small datasets.
  • Hash Join: Used for large, unsorted data.
  • Merge Join: Used when both tables are sorted.
SELECT * FROM Employees
JOIN Salaries ON Employees.Salary = Salaries.Amount
OPTION (MERGE JOIN);

 

Example: Get all Delhi customers who placed an order above Rs. 5000

SELECT Name FROM Customers
WHERE CustomerID IN (
    SELECT CustomerID FROM Orders WHERE Amount > 5000
);

Used for hierarchical data (e.g., company hierarchy).

Example: Find an employee's chain of managers

WITH EmployeeHierarchy AS (
    SELECT EmployeeID, Name, ManagerID
    FROM Employees
    WHERE EmployeeID = 101  -- Start with an Employee
    UNION ALL
    SELECT E.EmployeeID, E.Name, E.ManagerID
    FROM Employees E
    JOIN EmployeeHierarchy EH ON E.ManagerID = EH.EmployeeID
)
SELECT * FROM EmployeeHierarchy;

 

Database-Specific Behavior

  • PostgreSQL / MySQL 8+ - WITH RECURSIVE is required.
  • SQL Server / Oracle - WITH alone is sufficient (recursive behavior is inferred).
  • SQLite - WITH RECURSIVE is required.

Window functions like RANK() or DENSE_RANK() work with joins for ranking within groups.

Example: Find top 3 highest salaries per department

SELECT E.Name, E.Department, S.Amount,
       RANK() OVER (PARTITION BY E.Department ORDER BY S.Amount DESC) AS Rank
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
WHERE Rank <= 3;

 

SELECT Name FROM Employees WHERE ManagerID IS NULL;

SELECT Customers.Name FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.CustomerID IS NULL;

 

SELECT O.*
FROM Orders O
JOIN (
    SELECT CustomerID, MAX(OrderDate) AS LatestOrder
    FROM Orders
    GROUP BY CustomerID
) Latest ON O.CustomerID = Latest.CustomerID AND O.OrderDate = Latest.LatestOrder;

SELECT Departments.DepartmentName
FROM Departments
LEFT JOIN Employees ON Departments.DepartmentID = Employees.DepartmentID
WHERE Employees.EmployeeID IS NULL;

SELECT Name, COUNT(*)
FROM Customers
GROUP BY Name
HAVING COUNT(*) > 1;

SELECT Name, Department, Salary
FROM (
    SELECT Name, Department, Salary,
           DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS rnk
    FROM Employees
) Ranked
WHERE rnk = 2;

SELECT A.Name, B.Amount
FROM db1.Customers A
JOIN db2.Orders B ON A.CustomerID = B.CustomerID;

  • Use indexes
  • Use partitioning
  • Filter data before joining
  • Use EXISTS instead of IN()

Pagination is useful for displaying large datasets page by page.

Example: Fetching page 2 with 10 records per page

SELECT Customers.Name, Orders.Amount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Orders.OrderDate DESC
LIMIT 10 OFFSET 10;
  • LIMIT 10 - Fetch 10 records
  • OFFSET 10 - Skip the first 10

Used to track changes in employee salaries.

SELECT E1.EmployeeID, E1.Name, E1.Salary AS CurrentSalary, E2.Salary AS PreviousSalary
FROM Employees E1
LEFT JOIN EmployeeHistory E2 ON E1.EmployeeID = E2.EmployeeID
WHERE E1.Salary <> E2.Salary;

This compares current salaries (Employees table) vs. past salaries (EmployeeHistory table).

A derived table is a subquery used inside a JOIN.

Example: Fetch customers who have spent more than Rs.10,000

SELECT C.Name, Spending.TotalAmount
FROM Customers C
JOIN (
    SELECT CustomerID, SUM(Amount) AS TotalAmount
    FROM Orders
    GROUP BY CustomerID
    HAVING SUM(Amount) > 10000
) Spending ON C.CustomerID = Spending.CustomerID;

 

Create indexes on join columns for faster lookups.

CREATE INDEX idx_customerID ON Orders(CustomerID);

This improves join performance between Customers and Orders.

  • INNER JOIN removes non-matching rows.
  • OUTER JOIN keeps non-matching rows but fills NULLs.

Example: Using WHERE incorrectly in a LEFT JOIN

SELECT C.Name, O.Amount
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID
WHERE O.Amount > 5000;  --  This converts it into an INNER JOIN

Instead, filter inside JOIN condition

SELECT C.Name, O.Amount
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID AND O.Amount > 5000;

 

CTEs help break down complex joins into readable parts.

Example: Find employees earning above the department average

WITH AvgSalary AS (
    SELECT Department, AVG(Salary) AS DeptAvg
    FROM Employees
    GROUP BY Department
)
SELECT E.Name, E.Salary, A.DeptAvg
FROM Employees E
JOIN AvgSalary A ON E.Department = A.Department
WHERE E.Salary > A.DeptAvg;

 

  1. Denormalization (store combined data in one table)
  2. Use EXISTS instead of IN
  3. Use Materialized Views for precomputed joins
  4. Index foreign keys

 

Use COALESCE() to replace NULL values.

SELECT Customers.Name, COALESCE(Orders.Amount, 0) AS OrderAmount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

If a customer has no orders, it returns 0 instead of NULL.

Denormalization reduces joins by storing redundant data, improving performance.

Example: Instead of joining Orders and Customers, store CustomerName in Orders.

ALTER TABLE Orders ADD COLUMN CustomerName VARCHAR(255);
UPDATE Orders O
JOIN Customers C ON O.CustomerID = C.CustomerID
SET O.CustomerName = C.Name;

 

Joins: Faster for large datasets, better performance with indexes.
Subqueries: Simpler to read but can be slower.

Example: Fetching customer names using JOIN vs. Subquery

-- Using JOIN
SELECT C.Name, O.Amount
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID;

-- Using Subquery
SELECT Name, (SELECT SUM(Amount) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS TotalSpent
FROM Customers;

Materialized Views store the result of a JOIN, making retrieval faster.

CREATE MATERIALIZED VIEW CustomerOrders AS
SELECT C.Name, O.Amount
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID;

Instead of running the JOIN every time, querying the view is much faster.

Common mistakes and fixes:

Forgetting indexes on join columns - CREATE INDEX idx_column ON Table(Column);
**Using SELECT *** instead of selecting only necessary columns
Using WHERE with LEFT JOIN incorrectly - Use conditions inside ON
Not handling NULL values properly

Share   Share