SQL Joins - Interview Questions and Answers
SQL Joins are used to combine data from multiple tables based on a related column. Joins help retrieve meaningful insights from relational databases.
For example, in an Indian retail database:
Customers
table (CustomerID, Name, City)Orders
table (OrderID, CustomerID, Amount)
A JOIN
can combine customer details with their order history.
- INNER JOIN: Returns matching rows from both tables.
- OUTER JOIN: Returns matching and non-matching rows, filling gaps with NULLs.
-- INNER JOIN (Only matching customers with orders)
SELECT Customers.Name, Orders.Amount
FROM Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
Returns all rows from the left table, even if there's no match in the right table.
-- Show all customers, even those who haven?t placed orders
SELECT Customers.Name, Orders.Amount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
Returns all rows from the right table, even if there's no match in the left table.
-- Show all orders, even if some customers are missing
SELECT Customers.Name, Orders.Amount
FROM Customers
RIGHT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
Returns all records from both tables, filling gaps with NULLs.
SELECT Customers.Name, Orders.Amount
FROM Customers
FULL OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
Produces a cartesian product, multiplying each row from table A with each row from table B.
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
CROSS JOIN Departments;
Used in scenarios like generating price combinations or test data.
A SELF JOIN joins a table to itself.
Example: Find Indian employees and their managers from the same table.
SELECT E1.Name AS Employee, E2.Name AS Manager
FROM Employees E1
JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID;
JOIN
merges columns from two tables.UNION
merges rows from two queries.
-- UNION Example (Combine lists of Mumbai & Delhi customers)
SELECT Name FROM Customers WHERE City = 'Mumbai'
UNION
SELECT Name FROM Customers WHERE City = 'Delhi';
It creates a CROSS JOIN, generating all possible combinations of rows.
Yes, you can join multiple tables.
SELECT Customers.Name, Orders.Amount, Payments.PaymentDate
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
JOIN Payments ON Orders.OrderID = Payments.OrderID;
- LEFT JOIN - All rows from the left table + matches from the right table.
- RIGHT JOIN - All rows from the right table + matches from the left table.
- CROSS JOIN - Produces all possible combinations.
- INNER JOIN - Returns only matching rows.
It fills missing values with NULL where no match is found.
- EQUI JOIN - Uses
=
operator (Example:ON A.ID = B.ID
) - NON-EQUI JOIN - Uses
<
,>
,<=
, etc.
SELECT Employees.Name, Salaries.Amount
FROM Employees
JOIN Salaries ON Employees.Salary > Salaries.MinSalary;
- ON - Filters data before joining.
- WHERE - Filters data after joining.
SELECT * FROM Orders
LEFT JOIN Customers ON Orders.CustomerID = Customers.CustomerID
WHERE Customers.City = 'Delhi';
- NATURAL JOIN automatically finds common columns.
- INNER JOIN requires explicit
ON
conditions.
SELECT * FROM Customers NATURAL JOIN Orders;
Yes, it?s used for aggregations.
SELECT City, COUNT(Orders.OrderID)
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
GROUP BY City;
SELECT Customers.Name, Orders.Amount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Orders.Amount DESC;
Used for filtering aggregated data.
SELECT City, COUNT(OrderID) AS OrderCount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
GROUP BY City
HAVING COUNT(OrderID) > 10;
- Indexes improve JOIN performance by making lookups faster.
- Clustered Indexes store data in sorted order.
- Non-clustered Indexes store pointers to data.
CREATE INDEX idx_CustomerID ON Orders(CustomerID);
A join using multiple columns.
SELECT * FROM Orders
JOIN Payments ON Orders.OrderID = Payments.OrderID
AND Orders.CustomerID = Payments.CustomerID;
Used for hierarchical data.
SELECT E1.Name AS Employee, E2.Name AS Manager
FROM Employees E1
JOIN Employees E2 ON E1.ManagerID = E2.EmployeeID;
Using multiple joins can slow down performance if:
- Indexes are missing on join columns.
- Too many tables are joined at once.
- Joins involve large datasets without filters.
Optimization Tip: Use EXPLAIN PLAN in SQL Server or MySQL to check query execution.
EXPLAIN SELECT * FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
JOIN Payments ON Orders.OrderID = Payments.OrderID;
- Use Indexes on join columns.
- Filter data early using
WHERE
orON
. - Use proper join order (smaller datasets first).
- **Avoid SELECT *** (fetch only needed columns).
- Consider partitioning for huge tables.
Example: Using an index on CustomerID for faster lookups
CREATE INDEX idx_customerID ON Orders(CustomerID);
An ANTI JOIN finds rows in one table that do not have a match in another.
Example: Find customers who haven?t placed any orders
SELECT Customers.Name
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.CustomerID IS NULL;
Since FULL OUTER JOIN is not supported in MySQL, it can be replaced with UNION.
SELECT Customers.Name, Orders.Amount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
UNION
SELECT Customers.Name, Orders.Amount
FROM Customers
RIGHT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
These are SQL Server execution plans:
- Nested Loop Join: Best for small datasets.
- Hash Join: Used for large, unsorted data.
- Merge Join: Used when both tables are sorted.
SELECT * FROM Employees
JOIN Salaries ON Employees.Salary = Salaries.Amount
OPTION (MERGE JOIN);
Example: Get all Delhi customers who placed an order above Rs. 5000
SELECT Name FROM Customers
WHERE CustomerID IN (
SELECT CustomerID FROM Orders WHERE Amount > 5000
);
Used for hierarchical data (e.g., company hierarchy).
Example: Find an employee's chain of managers
WITH EmployeeHierarchy AS (
SELECT EmployeeID, Name, ManagerID
FROM Employees
WHERE EmployeeID = 101 -- Start with an Employee
UNION ALL
SELECT E.EmployeeID, E.Name, E.ManagerID
FROM Employees E
JOIN EmployeeHierarchy EH ON E.ManagerID = EH.EmployeeID
)
SELECT * FROM EmployeeHierarchy;
Database-Specific Behavior
- PostgreSQL / MySQL 8+ -
WITH RECURSIVE
is required. - SQL Server / Oracle -
WITH
alone is sufficient (recursive behavior is inferred). - SQLite -
WITH RECURSIVE
is required.
Window functions like RANK()
or DENSE_RANK()
work with joins for ranking within groups.
Example: Find top 3 highest salaries per department
SELECT E.Name, E.Department, S.Amount,
RANK() OVER (PARTITION BY E.Department ORDER BY S.Amount DESC) AS Rank
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
WHERE Rank <= 3;
SELECT Name FROM Employees WHERE ManagerID IS NULL;
SELECT Customers.Name FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.CustomerID IS NULL;
SELECT O.*
FROM Orders O
JOIN (
SELECT CustomerID, MAX(OrderDate) AS LatestOrder
FROM Orders
GROUP BY CustomerID
) Latest ON O.CustomerID = Latest.CustomerID AND O.OrderDate = Latest.LatestOrder;
SELECT Departments.DepartmentName
FROM Departments
LEFT JOIN Employees ON Departments.DepartmentID = Employees.DepartmentID
WHERE Employees.EmployeeID IS NULL;
SELECT Name, COUNT(*)
FROM Customers
GROUP BY Name
HAVING COUNT(*) > 1;
SELECT Name, Department, Salary
FROM (
SELECT Name, Department, Salary,
DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS rnk
FROM Employees
) Ranked
WHERE rnk = 2;
SELECT A.Name, B.Amount
FROM db1.Customers A
JOIN db2.Orders B ON A.CustomerID = B.CustomerID;
- Use indexes
- Use partitioning
- Filter data before joining
- Use EXISTS instead of
IN()
Pagination is useful for displaying large datasets page by page.
Example: Fetching page 2 with 10 records per page
SELECT Customers.Name, Orders.Amount
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Orders.OrderDate DESC
LIMIT 10 OFFSET 10;
LIMIT 10
- Fetch 10 recordsOFFSET 10
- Skip the first 10
Used to track changes in employee salaries.
SELECT E1.EmployeeID, E1.Name, E1.Salary AS CurrentSalary, E2.Salary AS PreviousSalary
FROM Employees E1
LEFT JOIN EmployeeHistory E2 ON E1.EmployeeID = E2.EmployeeID
WHERE E1.Salary <> E2.Salary;
This compares current salaries (Employees
table) vs. past salaries (EmployeeHistory
table).
A derived table is a subquery used inside a JOIN.
Example: Fetch customers who have spent more than Rs.10,000
SELECT C.Name, Spending.TotalAmount
FROM Customers C
JOIN (
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM Orders
GROUP BY CustomerID
HAVING SUM(Amount) > 10000
) Spending ON C.CustomerID = Spending.CustomerID;
Create indexes on join columns for faster lookups.
CREATE INDEX idx_customerID ON Orders(CustomerID);
This improves join performance between Customers and Orders.
INNER JOIN
removes non-matching rows.OUTER JOIN
keeps non-matching rows but fills NULLs.
Example: Using WHERE
incorrectly in a LEFT JOIN
SELECT C.Name, O.Amount
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID
WHERE O.Amount > 5000; -- This converts it into an INNER JOIN
Instead, filter inside JOIN condition
SELECT C.Name, O.Amount
FROM Customers C
LEFT JOIN Orders O ON C.CustomerID = O.CustomerID AND O.Amount > 5000;
CTEs help break down complex joins into readable parts.
Example: Find employees earning above the department average
WITH AvgSalary AS (
SELECT Department, AVG(Salary) AS DeptAvg
FROM Employees
GROUP BY Department
)
SELECT E.Name, E.Salary, A.DeptAvg
FROM Employees E
JOIN AvgSalary A ON E.Department = A.Department
WHERE E.Salary > A.DeptAvg;
- Denormalization (store combined data in one table)
- Use EXISTS instead of IN
- Use Materialized Views for precomputed joins
- Index foreign keys
Use COALESCE()
to replace NULL values.
SELECT Customers.Name, COALESCE(Orders.Amount, 0) AS OrderAmount
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
If a customer has no orders, it returns 0 instead of NULL.
Denormalization reduces joins by storing redundant data, improving performance.
Example: Instead of joining Orders
and Customers
, store CustomerName
in Orders
.
ALTER TABLE Orders ADD COLUMN CustomerName VARCHAR(255);
UPDATE Orders O
JOIN Customers C ON O.CustomerID = C.CustomerID
SET O.CustomerName = C.Name;
Joins: Faster for large datasets, better performance with indexes.
Subqueries: Simpler to read but can be slower.
Example: Fetching customer names using JOIN vs. Subquery
-- Using JOIN
SELECT C.Name, O.Amount
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID;
-- Using Subquery
SELECT Name, (SELECT SUM(Amount) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS TotalSpent
FROM Customers;
Materialized Views store the result of a JOIN, making retrieval faster.
CREATE MATERIALIZED VIEW CustomerOrders AS
SELECT C.Name, O.Amount
FROM Customers C
JOIN Orders O ON C.CustomerID = O.CustomerID;
Instead of running the JOIN every time, querying the view is much faster.
Common mistakes and fixes:
Forgetting indexes on join columns - CREATE INDEX idx_column ON Table(Column);
**Using SELECT *** instead of selecting only necessary columns
Using WHERE with LEFT JOIN incorrectly - Use conditions inside ON
Not handling NULL values properly
Random Blogs
- SQL Joins Explained: A Complete Guide with Examples
- How to Start Your Career as a DevOps Engineer
- Extract RGB Color From a Image Using CV2
- Python Challenging Programming Exercises Part 2
- Top 10 Blogs of Digital Marketing you Must Follow
- Datasets for Natural Language Processing
- Python Challenging Programming Exercises Part 1
- Top 10 Knowledge for Machine Learning & Data Science Students
- Mastering Python in 2025: A Complete Roadmap for Beginners
- The Ultimate Guide to Machine Learning (ML) for Beginners