A JOIN command in SQL is used to combine data or rows from two or more tables based on a common field between them. Primarily, there are four main joins in SQL:
However, in this article, I’m also going to discuss the following joins and operators:
Every topic is supplemented by a visualisation. At the end, there is also a table that mentions the differences between JOIN and UNION; INNER JOIN and INTERSECT. All commands have been performed in MySQL.
I’ll be using the following two tables in the examples to explain each topic:
employees
salary
1. INNER JOIN
The INNER JOIN keyword will create the resultant table by combining all rows from both the tables where the condition satisfies i.e., value of the common field will be same.
Example:
Let’s query the employee names (emp_name
) and their respective salaries (salary
) for values of emp_id
that are same in both the tables.
Note: employees.emp_id
and salary.emp_id
are the matching key columns.
SELECT e.emp_name, s.salary FROM employees AS eINNER JOINsalary AS sON e.emp_id=s.emp_id;
We get the following output:
As you can see, the output is data containing all rows of the values of emp_id
that are present in both employees and salary tables. The following image outlines the working process of INNER JOIN in this example.
Since the name of matching key column is same in both the tables, i.e. emp_id
, we can also employ the USING command instead of ON. Note that you have to use parenthesis () after the USING keyword to mention the name of the matching key column.
SELECT e.emp_name, s.salary FROM employees AS eINNER JOINsalary AS sUSING (emp_id);
We will get the same output.
2. LEFT JOIN
The LEFT JOIN keyword will create a table that contains all the rows of the left side table and matching rows of the right-side table. The rows for which there is no matching row on right side, the resultant table will contain NULL
. LEFT JOIN is also known as LEFT OUTER JOIN.
Example:
Let’s query employee names (emp_name
) and their corresponding salaries (salary
) from the employees and salary tables respectively.
SELECT e.emp_name, s.salary FROM employees AS eLEFT JOINsalary AS sON e.emp_id=s.emp_id;
We get the following output:
Since there are some emp_id
s which are present on the left side table but not the right side table, few salary values are missing or NULL
.
What happens if there is more than one row of data on the right side table, for the same matching id on the left side table? All the corresponding values in the right side table get printed.
3. RIGHT JOIN
The RIGHT JOIN keyword will create resultant table that contains all the rows of the right-side table and matching rows of the left-side table. The rows for which there is no matching row on the left side, the resultant table will contain NULL
. RIGHT JOIN is also known as RIGHT OUTER JOIN.
Example:
Let’s query salaries (salary
) and their corresponding employee names (emp_name
) from the salary and employees tables respectively.
SELECT e.emp_name, s.salary FROM employees AS eRIGHT JOINsalary AS sON e.emp_id=s.emp_id;
We get the following output:
Since there are some emp_id
s which are present on the right side table (salary
) but not the left side table (employees
) , few employee names are missing or NULL
.
What happens if there is more than one row of data on the left side table, for the same matching id on the right side table? Similar to the explanation of this case in LEFT JOIN, all the corresponding values in the left side table get printed.
4. FULL JOIN
FULL JOIN keyword will create the resultant table by combining result of both LEFT JOIN and RIGHT JOIN. The table will contain all the rows from both the tables. The rows for which there is no matching, the result-set will contain NULL
values. FULL JOIN is also known as FULL OUTER JOIN.
Example
Let’s query all employee names (emp_name
) and all salaries (salary
) from both the given tables.
SELECT e.emp_name, s.salary FROM employees AS eFULL JOINsalary AS sON e.emp_id=s.emp_id;
We get the following output:
Now that we’ve covered the four basic joins in SQL, let’s move on to the more uncommon ones.
5. CROSS JOIN
CROSS JOIN keyword is used to combine each row of the left table with each row of the right table. It is also known as the Cartesian join since it returns the Cartesian product of the sets of rows from the joined tables.
Unlike other joins, CROSS JOIN doesn’t require a matching key column as all possible combinations are printed. I won’t illustrate CROSS JOIN with an example using the given tables because the information would be inaccurate, i.e. a single employee will have multiple salaries, which is not possible. However, to give a brief idea about the keyword, an input code has been mentioned below.
SELECT * FROM employees AS eCROSS JOINsalary AS s;
6. SELF JOIN
The SELF JOIN keyword will create resultant table by joining a table to itself. Each row of a table is combined with itself and with every other row of the table. SELF JOIN is used to extract specific queries and hence does not have a universal working process. At least one table must be renamed in the query.
Example:
Let’s query employee’s name and their manager’s name through SELF JOIN.
SELECT a1.emp_name AS Employee, a2.emp_name AS ManagerFROM employees AS a1JOIN employees AS a2ON a1.manager_id=a2.emp_idORDER BY a1.emp_name ASC;
We get the following output:
Now that we’ve covered all the JOINs, let’s move on to Operators in SQL.
7. UNION
The UNION keyword is used to combine two separate SELECT statements and produce the resultant table as a merger of both the SELECT statements. The fields to be used in both the select statements must be in same order, same quantity and same data type.
Example
SELECT emp_id FROM employeesUNIONSELECT emp_id FROM salary;
We get the following output:
8. UNION ALL
The UNION ALL keyword is also used to combine two separate SELECT statements and produce the resultant table as a merger of both the SELECT statements. As mentioned for UNION, the fields to be used in both the select statements must be in same order, same quantity and same data type.
The only difference between UNION and UNION ALL is that they return distinct and duplicate values respectively. As a result, if you want to query distinct values, use UNION. If you also want to include the duplicate entries, use UNION ALL.
Example:
SELECT emp_id FROM employeesUNION ALLSELECT emp_id FROM salary;
We get the following output:
9. INTERSECT
INTERSECT keyword provides the result of the cohesion of two SELECT statements. The resultant table will contain all the rows which are common to both the SELECT statements. It will not return any duplicate values.
Example:
SELECT emp_id FROM employeesINTERSECTSELECT emp_id FROM salary;
We get the following output:
10. EXCEPT
EXCEPT keyword provides the result of the inconsistency of two SELECT statements. The resultant table will contain all the rows except the common rows of the two SELECT statements.
Example:
SELECT emp_id FROM employeesEXCEPTSELECT emp_id FROM salary;
We get the following output:
Now that we’ve covered all the JOINs and operators in SQL, let’s discuss some differences between topics that may seem similar. For instance, JOIN v/s UNION and INNER JOIN v/s INTERSECT.
JOIN v/s UNION
INNER JOIN v/s INTERSECT
I hope that this article clarified your concepts on the various types of JOINs in SQL; as well as some operators. This is a fairly important subject with respect to Data Science Interviews and thus, it’s essential to be familiar with the aforementioned topics.