UNION

We learned that the JOIN is one way to bring multiple tables together—fused side-by-side.

UNION is another way. It fuses tables top-to-bottom.

Suppose you're working on an online education platform similar to Khan Academy, called "Con Academy". Similar to Khan Academy, you have teachers publishing lessons and assignments online for any student around the world to enroll in at any time.

The students' info is stored in the students table.


students
column_name type
id
VARCHAR
name
VARCHAR
email
VARCHAR
city
VARCHAR
gpa
NUMERIC

You want to monitor the students who are excelling, and also the ones who may need additional help. So you need to query the top 3 and bottom 3 students by GPA.

Of course, we know how to do that with 2 separate queries.

SELECT *
FROM students
WHERE gpa IS NOT NULL
ORDER BY gpa DESC
LIMIT 3;
id
VARCHAR
name
VARCHAR
email
VARCHAR
city
VARCHAR
gpa
NUMERIC
53463534
Damian Reynolds
Damian.Reynolds@hotmail.com
New York
3.996
17296531
Aurelio Grant
Aurelio.Grant@gmail.com
Philadelphia
3.907
74726042
Alphonso Raynor
Alphonso.Raynor@yahoo.com
San Francisco
3.899
SELECT *
FROM students
WHERE gpa IS NOT NULL
ORDER BY gpa
LIMIT 3;
id
VARCHAR
name
VARCHAR
email
VARCHAR
city
VARCHAR
gpa
NUMERIC
83063179
Eloisa Buckridge
Eloisa_Buckridge@gmail.com
New Orleans
1.513
36022065
Evangeline Herzog
Evangeline46@gmail.com
Pheonix
1.519
81865678
Dylan Williamson
Dylan_Williamson60@yahoo.com
NULL
1.541

Note that we disregard students where gpa is NULL.

What if we wanted to do this with 1 single query? That's what UNION does. It takes 2 whole queries and stacks their results into 1 table altogether.

(
  SELECT *
  FROM students
  WHERE gpa IS NOT NULL
  ORDER BY gpa DESC
  LIMIT 3
)
UNION
(
  SELECT *
  FROM students
  WHERE gpa IS NOT NULL
  ORDER BY gpa
  LIMIT 3
);
id
VARCHAR
name
VARCHAR
email
VARCHAR
city
VARCHAR
gpa
NUMERIC
81865678
Dylan Williamson
Dylan_Williamson60@yahoo.com
NULL
1.541
17296531
Aurelio Grant
Aurelio.Grant@gmail.com
Philadelphia
3.907
83063179
Eloisa Buckridge
Eloisa_Buckridge@gmail.com
New Orleans
1.513
53463534
Damian Reynolds
Damian.Reynolds@hotmail.com
New York
3.996
36022065
Evangeline Herzog
Evangeline46@gmail.com
Pheonix
1.519
74726042
Alphonso Raynor
Alphonso.Raynor@yahoo.com
San Francisco
3.899

You just need 2 separate queries with the UNION keyword between them.

There is an important constraint, though. The 2 tables must have the same number of columns, and the data types of the corresponding columns have to match; integers on top of integers, text on top of text, etc. Otherwise, you get an error.

(
  SELECT id, email, gpa
  FROM students
  WHERE gpa IS NOT NULL
  ORDER BY gpa DESC
  LIMIT 3
)
UNION
(
  SELECT id, email, name, gpa
  FROM students
  WHERE gpa IS NOT NULL
  ORDER BY gpa
  LIMIT 3
);
error: each UNION query must have the same number of columns

UNIONing Different Tables

It is possible to UNION any queries that produce tables with the same number of columns and data types.

Suppose we are helping the marketing department at Con Academy. They want to advertise that the students and teachers are all around the world. They want you to give them a list of cities that the students and teachers live in. They also don't want duplicates to appear in the list.


teachers
column_name type
id
VARCHAR
name
VARCHAR
email
VARCHAR
subject
VARCHAR
city
VARCHAR
SELECT city
FROM students
WHERE city IS NOT NULL
UNION
SELECT city
FROM teachers;
city
VARCHAR
Pheonix
New Orleans
San Francisco
Chicago
Seattle
New York
Philadelphia
Houston
Los Angeles
Atlanta
Miami

It turns out that UNION automatically removes duplicates, so that's no problem!

If for some reason you need to allow duplicates, you can replace UNION with UNION ALL.

Other Set Operations

You can think of UNION as a combination of the rows from 2 tables. It is also possible to subtract rows seen in one table from another table with EXCEPT.

If you wanted a list of cities with students but no teachers:

SELECT city
FROM students
WHERE city IS NOT NULL
EXCEPT
SELECT city
FROM teachers;
city
VARCHAR
Chicago
Pheonix
New York
Philadelphia
Atlanta

There is also a way to get rows that only appear in both tables with INTERSECT. Here are the cities with both students and teachers.

SELECT city
FROM students
WHERE city IS NOT NULL
INTERSECT
SELECT city
FROM teachers;
city
VARCHAR
Houston
Seattle
Los Angeles
New Orleans
San Francisco
Miami

© 2022 Andrew Carlson. All rights reserved.