Junction Tables

We have seen examples of one-to-one and one-to-many relationships.

For a social media website like "Tweeter", tweeters and tweets is an example of one-to-many. This is because each user can have many tweets, and each tweet must belong to only 1 user. In this case, the relationship is represented as a foreign key: the user_id column in the tweets table. This representation works because we know each tweet is owned by only 1 user. The same is true for a one-to-one relationship.

Users Can Like Tweets

Our app will have a feature where you can like a tweet, and see all the tweets you've liked. Users liking tweets is another relationship! It's a totally new kind: many-to-many. A user can like many tweets, and a tweet can be liked by many users.

How would we represent this in our tables? Can we add a foreign key in the tweeters or the tweets table? Think back to our other examples of foreign keys. A user_id in tweets works because a tweet belongs to 1 user. A user_id in drivers_licenses works because a license belongs to 1 user. Trying to add some sort of liker_id to tweets or liked_id to tweeters won't work because there are many values.

So what do we do?

Junction Table

We can represent a many-to-many relationship using a separate table, likes, for this relationship.


tweeters
column_name type
id
INTEGER
username
VARCHAR
email
VARCHAR
registered_at
TIMESTAMPTZ

tweets
column_name type reference
id
INTEGER
NULL
body
TEXT
NULL
user_id
INTEGER
tweeters.id
created_at
TIMESTAMPTZ
NULL

likes
column_name type reference
user_id
INTEGER
tweeters.id
tweet_id
INTEGER
tweets.id
created_at
TIMESTAMPTZ
NULL

The likes table is just a table of 2 foreign keys to tweeters and tweets. Each row in likes is a like, and it stores which tweet was liked and who did it. There are many different names for this type of table: a "join table", "cross table", or "junction table".

Table Name: tweets id body 144 how you doin' 267 what's up 83 were on a break Table Name: likes user_id tweet_id 42 144 42 267 71 267 71 83 Table Name: tweeters username id joey 42 ross 71

In this diagram, we have 2 users, @joey and @ross. Joey likes the tweets "how you doin'" and "what's up" while Ross likes "what's up" and "were on a break".

Joining Through the Junction Table

What if we wanted to get all the tweets liked by the user named @hilda.grant53? We need information (the username, to get the right user) from tweeters and from tweets. To know which tweets to get, we need to join through likes. So, we need to join tweeters, likes and tweets, in a triple join.

SELECT tweets.*
FROM tweeters liker
JOIN likes
  ON likes.user_id = liker.id
JOIN tweets
  ON likes.tweet_id = tweets.id
WHERE liker.username = 'hilda.grant53';
id
INTEGER
body
TEXT
user_id
INTEGER
created_at
TIMESTAMPTZ
22
You can't hack the bandwidth without quantifying the auxiliary SQL firewall!
3
2019-12-03 07:27:54.773+00
27
You can't index the driver without connecting the wireless FTP application!
10
2021-04-26 20:48:05.334+00
33
We need to back up the back-end AGP pixel!
10
2021-09-23 20:10:58.765+00
37
generating the protocol won't do anything, we need to back up the virtual USB hard drive!
1
2021-12-20 11:41:28.716+00
35
I'll back up the mobile AI circuit, that should array the EXE hard drive!
9
2021-11-10 23:56:02.234+00

Joining the Same Table Multiple Times

We can get all the tweets that @hilda.grant53 likes. However, we don't know the usernames of who wrote those tweets.

The username column is back in the tweeters table, so to fetch it we'll need to join tweeters again.

SELECT
  tweets.*,
  authors.username AS author_username
FROM tweeters liker
JOIN likes
  ON likes.user_id = liker.id
JOIN tweets
  ON likes.tweet_id = tweets.id
JOIN tweeters authors
  ON authors.id = tweets.user_id
WHERE liker.username = 'hilda.grant53';
id
INTEGER
body
TEXT
user_id
INTEGER
created_at
TIMESTAMPTZ
author_username
VARCHAR
22
You can't hack the bandwidth without quantifying the auxiliary SQL firewall!
3
2019-12-03 07:27:54.773+00
brisa_jenkins39
27
You can't index the driver without connecting the wireless FTP application!
10
2021-04-26 20:48:05.334+00
madge.wehner
33
We need to back up the back-end AGP pixel!
10
2021-09-23 20:10:58.765+00
madge.wehner
37
generating the protocol won't do anything, we need to back up the virtual USB hard drive!
1
2021-12-20 11:41:28.716+00
deonte_murray45
35
I'll back up the mobile AI circuit, that should array the EXE hard drive!
9
2021-11-10 23:56:02.234+00
enrico.durgan1

This can get confusing. The tweeters table needs to be referenced twice because the relationship of the users to the other rows in the query are different for each reference. We had to get tweets liked by a certain user, so tweeters represents "people who liked tweets" at first. But then we also reference tweeters to get "people who wrote tweets" which is a different set of users.

Each time we reference the tweeters table we should alias it appropriately. Think of each alias as a separate instance of data from the underlying table, and you can use those aliases to specify which instance you query. In the last query, we get authors.username to specify that we aren't looking for username of the liker, @hilda.grant53.

© 2022 Andrew Carlson. All rights reserved.