A subquery is simply query inside a query. Nested. While JOIN is like a merger of two tables to form a new table virtually. It combines data from two different tables or the table itself(self-join).
We can get many different results from using Subquery and JOIN which, many times, can't be done by simple SQL query. But most of the subqueries can be rewritten using JOIN and vice versa. Let's take a look at the examples.
Let's take a look at the examples to understand this better.
//JOIN example
SELECT invoice_number, invoice_date, invoice_total FROM invoices JOIN sellers
ON invoices.seller_id = sellers.seller_id
WHERE seller_state='FL'
ORDER BY invoice_date;
The query above is fetching the number, date and total for each invoices and joins with sellers on sellers_id. Also, we have a condition in WHERE which will limit the result set to Florida sellers only.
Now take a look the following query.
//Subquery example
SELECT invoice_number, invoice_date, invoice_total FROM invoices
WHERE seller_id IN
(SELECT seller_id FROM sellers WHERE seller_state= 'FL')
ORDER BY invoice_date;
In the first query I have used JOIN to combine the sellers and invoices data and then I am checking for the state using WHERE condition.
In the second query, I have added a subquery in which it will only return the seller_id of Florida sellers. So, in that query I am fetching the data for only Florida sellers from invoices using WHERE condition.
So, what should we use?
I think we should use the query based on the complexity and relationship between the tables as well. But that's my opinion.
JOIN is more helpful when there is a relationship between tables like the query above. They both have seller_id column in common. Subquery is helpful whether there is a relation or not (ad hoc relationship we can say).
For complex queries subqueries are easier to use and divide the query in small parts and analyze from human eyes (more readable, more understandable). We don't always find relationships between all tables in complex queries so using subqueries might be a good idea.
Advantages of JOIN
- It included columns in result set from both tables.
- More helpful/intuitive when there is a relationship between tables.
Advantages of Subquery
- We can pass aggregate value using subquery
- More helpful/intuitive when there is no relationship or more like ad hoc relationship
- It makes complex and lengthy queries easier to write, more readable and understandable
comparison JOIN Subquery