Frames
We learned that PARTITION BY
is one way to limit which rows can be seen in a window for a window function.
It limits a row's window to only the rows where the value(s) in named in PARTITION BY
are equal.
These can be column values or computed ones.
Sometimes, a window needs to be determined by something more complex than just the equality of value(s). This is where frames are useful.
Suppose you're a software engineer working for a bank called "Goliath National Bank".
The mobile engineers are building an app to show the users their account balances.
You have a table called checking_account_activity
which shows the transactions for each account.
column_name | type |
---|---|
id | UUID |
account_id | VARCHAR |
created_at | TIMESTAMPTZ |
amount_usd | NUMERIC |
exchanged_with | VARCHAR |
The values in the amount_usd
column can be positive or negative dollar amounts.
Positive values are incoming payments or deposits into the account, and negative amounts are outgoing payments or withdrawals from the account.
Since we don't have account balances stored in any tables directly, we have to calculate them from their transaction history.
If we want to do this for every user, we can just take a sum of all of their transactions, and use a familiar GROUP BY account_id
.
SELECT
account_id,
sum(amount_usd) AS account_balance
FROM checking_account_activity
GROUP BY account_id;
account_id VARCHAR | account_balance NUMERIC |
---|---|
7052379077 | 5585.49 |
2512254032 | 4057.26 |
5878764270 | 3972.01 |
Goliath National Bank isn't so big yet. They only have 3 account holders as of now.
Windows Through Time
This is great. We have the account balances being calculated for the users.
But the users want more! They want to see their transaction history plus their account balance after each transaction.
Since we need to display the transaction history, we need to preserve each row in checking_account_activity
,
and therefore we cannot aggregate à la GROUP BY
.
Using sum()
as a window function can help us here since it will preserve the individual rows.
But what should the window be?
Let's just start with what we know so far.
SELECT
account_id,
created_at,
exchanged_with,
amount_usd,
sum(amount_usd) OVER (
PARTITION BY account_id
)
FROM checking_account_activity
ORDER BY account_id, created_at;
account_id VARCHAR | created_at TIMESTAMPTZ | exchanged_with VARCHAR | amount_usd NUMERIC | sum NUMERIC |
---|---|---|---|---|
2512254032 | 2022-04-01 06:35:31+00 | NULL | 1000.00 | 4057.26 |
2512254032 | 2022-04-02 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 4057.26 |
2512254032 | 2022-04-02 11:38:58+00 | Aufderhar - Ferry | -619.03 | 4057.26 |
2512254032 | 2022-04-03 04:49:11+00 | Hauck, Bartoletti and Kemmer | -862.52 | 4057.26 |
2512254032 | 2022-04-03 09:41:17+00 | Price, Fadel and Hermiston | -394.16 | 4057.26 |
2512254032 | 2022-04-07 10:35:57+00 | Corkery, Botsford and Pacocha | -304.78 | 4057.26 |
2512254032 | 2022-04-10 03:19:38+00 | Reinger Group | -93.27 | 4057.26 |
2512254032 | 2022-04-10 23:16:42+00 | Cronin, Stark and Murazik | -357.68 | 4057.26 |
2512254032 | 2022-04-12 19:44:22+00 | Waelchi, Quigley and Barton | -146.78 | 4057.26 |
2512254032 | 2022-04-14 08:16:23+00 | Gaylord - Prohaska | -88.15 | 4057.26 |
2512254032 | 2022-04-15 07:49:20+00 | Durgan - Pollich | 208.65 | 4057.26 |
2512254032 | 2022-04-15 15:48:38+00 | Shields Inc | -487.96 | 4057.26 |
2512254032 | 2022-04-16 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 4057.26 |
2512254032 | 2022-04-16 02:41:58+00 | Hermiston, Shanahan and Walker | -234.96 | 4057.26 |
2512254032 | 2022-04-17 01:04:45+00 | Vandervort Group | -22.38 | 4057.26 |
2512254032 | 2022-04-22 18:36:51+00 | Fahey, Emmerich and Kub | 166.09 | 4057.26 |
2512254032 | 2022-04-24 20:42:15+00 | Littel - Witting | -732.83 | 4057.26 |
2512254032 | 2022-04-24 20:43:33+00 | Durgan - Schmitt | -82.89 | 4057.26 |
2512254032 | 2022-04-24 22:22:31+00 | Hettinger Group | -859.72 | 4057.26 |
2512254032 | 2022-04-29 00:22:37+00 | Wiza - Ritchie | -72.27 | 4057.26 |
2512254032 | 2022-04-29 14:00:57+00 | Hessel, Wuckert and Fritsch | 24.16 | 4057.26 |
2512254032 | 2022-04-30 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 4057.26 |
5878764270 | 2022-04-01 15:33:03+00 | NULL | 3000.00 | 3972.01 |
5878764270 | 2022-04-02 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 3972.01 |
5878764270 | 2022-04-05 06:16:08+00 | Pacocha - Bernhard | -111.38 | 3972.01 |
5878764270 | 2022-04-05 16:31:33+00 | Rath Group | -99.34 | 3972.01 |
5878764270 | 2022-04-07 05:14:46+00 | Bosco, Kiehn and Leannon | 239.33 | 3972.01 |
5878764270 | 2022-04-07 10:17:57+00 | Weimann, Hauck and Ritchie | -222.87 | 3972.01 |
5878764270 | 2022-04-11 17:06:34+00 | Steuber - Hermann | -189.83 | 3972.01 |
5878764270 | 2022-04-13 20:38:46+00 | Kunde Inc | -22.90 | 3972.01 |
5878764270 | 2022-04-16 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 3972.01 |
5878764270 | 2022-04-16 01:21:12+00 | Oberbrunner, Gerlach and Sawayn | 95.78 | 3972.01 |
5878764270 | 2022-04-16 02:13:24+00 | Kuphal Group | -234.24 | 3972.01 |
5878764270 | 2022-04-16 07:54:19+00 | Schmitt - Howell | -205.04 | 3972.01 |
5878764270 | 2022-04-25 21:33:52+00 | ATM withdrawal | -220.00 | 3972.01 |
5878764270 | 2022-04-26 14:40:52+00 | ATM withdrawal | -280.00 | 3972.01 |
5878764270 | 2022-04-30 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 3972.01 |
5878764270 | 2022-04-30 14:37:51+00 | Runolfsdottir LLC | -211.43 | 3972.01 |
7052379077 | 2022-04-01 13:10:00+00 | NULL | 3000.00 | 5585.49 |
7052379077 | 2022-04-02 00:00:00+00 | Franey LLC | 1027.23 | 5585.49 |
7052379077 | 2022-04-07 23:04:01+00 | Gulgowski Inc | 5.59 | 5585.49 |
7052379077 | 2022-04-09 20:25:43+00 | Hilpert, Greenfelder and Kiehn | 204.77 | 5585.49 |
7052379077 | 2022-04-10 07:38:47+00 | Greenfelder - Haag | -239.19 | 5585.49 |
7052379077 | 2022-04-13 14:20:47+00 | ATM withdrawal | -220.00 | 5585.49 |
7052379077 | 2022-04-14 11:46:34+00 | ATM withdrawal | -180.00 | 5585.49 |
7052379077 | 2022-04-16 00:00:00+00 | Franey LLC | 1027.23 | 5585.49 |
7052379077 | 2022-04-20 12:33:23+00 | Batz - Cole | 193.12 | 5585.49 |
7052379077 | 2022-04-21 00:36:10+00 | Hyatt - Ratke | -230.64 | 5585.49 |
7052379077 | 2022-04-26 00:55:05+00 | Okuneva - Lebsack | -29.85 | 5585.49 |
7052379077 | 2022-04-30 00:00:00+00 | Franey LLC | 1027.23 | 5585.49 |
This gives us the transaction history plus the current account balance. However, we don't see what the balance was at the time of each transaction. This is because the window for each transaction row includes is erroneously including the future rows. Instead, we want to limit the window for each transaction to be all transactions from the past and not the future. This is done with a frame. A frame is another way to filter down the window within a partition. A frame definition looks like this:
ROWS BETWEEN (frame_start) AND (frame_end)
The (frame_start)
can be one of:
CURRENT ROW
: starts the frame at the current row.n PRECEDING
, wheren
is an integer: starts the frame at n rows before the current row.UNBOUNDED PRECEDING
: includes all rows before the current row.
The (frame_end)
can be one of:
CURRENT ROW
: ends the frame at the current row.n FOLLOWING
, wheren
is an integer: ends the frame at n rows after the current row.UNBOUNDED FOLLOWING
: includes all rows after the current row.
That limits the window to only see rows that fall within the range specified by the frame.
This is what we want in our case:
SELECT
account_id,
created_at,
exchanged_with,
amount_usd,
sum(amount_usd) OVER (
PARTITION BY account_id ORDER BY created_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS account_balance
FROM checking_account_activity
ORDER BY account_id, created_at;
account_id VARCHAR | created_at TIMESTAMPTZ | exchanged_with VARCHAR | amount_usd NUMERIC | account_balance NUMERIC |
---|---|---|---|---|
2512254032 | 2022-04-01 06:35:31+00 | NULL | 1000.00 | 1000.00 |
2512254032 | 2022-04-02 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 3672.58 |
2512254032 | 2022-04-02 11:38:58+00 | Aufderhar - Ferry | -619.03 | 3053.55 |
2512254032 | 2022-04-03 04:49:11+00 | Hauck, Bartoletti and Kemmer | -862.52 | 2191.03 |
2512254032 | 2022-04-03 09:41:17+00 | Price, Fadel and Hermiston | -394.16 | 1796.87 |
2512254032 | 2022-04-07 10:35:57+00 | Corkery, Botsford and Pacocha | -304.78 | 1492.09 |
2512254032 | 2022-04-10 03:19:38+00 | Reinger Group | -93.27 | 1398.82 |
2512254032 | 2022-04-10 23:16:42+00 | Cronin, Stark and Murazik | -357.68 | 1041.14 |
2512254032 | 2022-04-12 19:44:22+00 | Waelchi, Quigley and Barton | -146.78 | 894.36 |
2512254032 | 2022-04-14 08:16:23+00 | Gaylord - Prohaska | -88.15 | 806.21 |
2512254032 | 2022-04-15 07:49:20+00 | Durgan - Pollich | 208.65 | 1014.86 |
2512254032 | 2022-04-15 15:48:38+00 | Shields Inc | -487.96 | 526.90 |
2512254032 | 2022-04-16 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 3199.48 |
2512254032 | 2022-04-16 02:41:58+00 | Hermiston, Shanahan and Walker | -234.96 | 2964.52 |
2512254032 | 2022-04-17 01:04:45+00 | Vandervort Group | -22.38 | 2942.14 |
2512254032 | 2022-04-22 18:36:51+00 | Fahey, Emmerich and Kub | 166.09 | 3108.23 |
2512254032 | 2022-04-24 20:42:15+00 | Littel - Witting | -732.83 | 2375.40 |
2512254032 | 2022-04-24 20:43:33+00 | Durgan - Schmitt | -82.89 | 2292.51 |
2512254032 | 2022-04-24 22:22:31+00 | Hettinger Group | -859.72 | 1432.79 |
2512254032 | 2022-04-29 00:22:37+00 | Wiza - Ritchie | -72.27 | 1360.52 |
2512254032 | 2022-04-29 14:00:57+00 | Hessel, Wuckert and Fritsch | 24.16 | 1384.68 |
2512254032 | 2022-04-30 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 4057.26 |
5878764270 | 2022-04-01 15:33:03+00 | NULL | 3000.00 | 3000.00 |
5878764270 | 2022-04-02 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 3811.31 |
5878764270 | 2022-04-05 06:16:08+00 | Pacocha - Bernhard | -111.38 | 3699.93 |
5878764270 | 2022-04-05 16:31:33+00 | Rath Group | -99.34 | 3600.59 |
5878764270 | 2022-04-07 05:14:46+00 | Bosco, Kiehn and Leannon | 239.33 | 3839.92 |
5878764270 | 2022-04-07 10:17:57+00 | Weimann, Hauck and Ritchie | -222.87 | 3617.05 |
5878764270 | 2022-04-11 17:06:34+00 | Steuber - Hermann | -189.83 | 3427.22 |
5878764270 | 2022-04-13 20:38:46+00 | Kunde Inc | -22.90 | 3404.32 |
5878764270 | 2022-04-16 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 4215.63 |
5878764270 | 2022-04-16 01:21:12+00 | Oberbrunner, Gerlach and Sawayn | 95.78 | 4311.41 |
5878764270 | 2022-04-16 02:13:24+00 | Kuphal Group | -234.24 | 4077.17 |
5878764270 | 2022-04-16 07:54:19+00 | Schmitt - Howell | -205.04 | 3872.13 |
5878764270 | 2022-04-25 21:33:52+00 | ATM withdrawal | -220.00 | 3652.13 |
5878764270 | 2022-04-26 14:40:52+00 | ATM withdrawal | -280.00 | 3372.13 |
5878764270 | 2022-04-30 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 4183.44 |
5878764270 | 2022-04-30 14:37:51+00 | Runolfsdottir LLC | -211.43 | 3972.01 |
7052379077 | 2022-04-01 13:10:00+00 | NULL | 3000.00 | 3000.00 |
7052379077 | 2022-04-02 00:00:00+00 | Franey LLC | 1027.23 | 4027.23 |
7052379077 | 2022-04-07 23:04:01+00 | Gulgowski Inc | 5.59 | 4032.82 |
7052379077 | 2022-04-09 20:25:43+00 | Hilpert, Greenfelder and Kiehn | 204.77 | 4237.59 |
7052379077 | 2022-04-10 07:38:47+00 | Greenfelder - Haag | -239.19 | 3998.40 |
7052379077 | 2022-04-13 14:20:47+00 | ATM withdrawal | -220.00 | 3778.40 |
7052379077 | 2022-04-14 11:46:34+00 | ATM withdrawal | -180.00 | 3598.40 |
7052379077 | 2022-04-16 00:00:00+00 | Franey LLC | 1027.23 | 4625.63 |
7052379077 | 2022-04-20 12:33:23+00 | Batz - Cole | 193.12 | 4818.75 |
7052379077 | 2022-04-21 00:36:10+00 | Hyatt - Ratke | -230.64 | 4588.11 |
7052379077 | 2022-04-26 00:55:05+00 | Okuneva - Lebsack | -29.85 | 4558.26 |
7052379077 | 2022-04-30 00:00:00+00 | Franey LLC | 1027.23 | 5585.49 |
Note that we also added an ORDER BY
to the window which is necessary because created_at
determines the order.
At each row, the window sees all past rows up until the current row.
Then, sum(amount_usd)
gets the sum, or account balance up to that point in time.
Default Frame
It turns out that the frame definition is optional here.
All that's needed is the ORDER BY
.
SELECT
account_id,
created_at,
exchanged_with,
amount_usd,
sum(amount_usd) OVER (
PARTITION BY account_id ORDER BY created_at
) AS account_balance
FROM checking_account_activity
ORDER BY account_id, created_at;
account_id VARCHAR | created_at TIMESTAMPTZ | exchanged_with VARCHAR | amount_usd NUMERIC | account_balance NUMERIC |
---|---|---|---|---|
2512254032 | 2022-04-01 06:35:31+00 | NULL | 1000.00 | 1000.00 |
2512254032 | 2022-04-02 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 3672.58 |
2512254032 | 2022-04-02 11:38:58+00 | Aufderhar - Ferry | -619.03 | 3053.55 |
2512254032 | 2022-04-03 04:49:11+00 | Hauck, Bartoletti and Kemmer | -862.52 | 2191.03 |
2512254032 | 2022-04-03 09:41:17+00 | Price, Fadel and Hermiston | -394.16 | 1796.87 |
2512254032 | 2022-04-07 10:35:57+00 | Corkery, Botsford and Pacocha | -304.78 | 1492.09 |
2512254032 | 2022-04-10 03:19:38+00 | Reinger Group | -93.27 | 1398.82 |
2512254032 | 2022-04-10 23:16:42+00 | Cronin, Stark and Murazik | -357.68 | 1041.14 |
2512254032 | 2022-04-12 19:44:22+00 | Waelchi, Quigley and Barton | -146.78 | 894.36 |
2512254032 | 2022-04-14 08:16:23+00 | Gaylord - Prohaska | -88.15 | 806.21 |
2512254032 | 2022-04-15 07:49:20+00 | Durgan - Pollich | 208.65 | 1014.86 |
2512254032 | 2022-04-15 15:48:38+00 | Shields Inc | -487.96 | 526.90 |
2512254032 | 2022-04-16 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 3199.48 |
2512254032 | 2022-04-16 02:41:58+00 | Hermiston, Shanahan and Walker | -234.96 | 2964.52 |
2512254032 | 2022-04-17 01:04:45+00 | Vandervort Group | -22.38 | 2942.14 |
2512254032 | 2022-04-22 18:36:51+00 | Fahey, Emmerich and Kub | 166.09 | 3108.23 |
2512254032 | 2022-04-24 20:42:15+00 | Littel - Witting | -732.83 | 2375.40 |
2512254032 | 2022-04-24 20:43:33+00 | Durgan - Schmitt | -82.89 | 2292.51 |
2512254032 | 2022-04-24 22:22:31+00 | Hettinger Group | -859.72 | 1432.79 |
2512254032 | 2022-04-29 00:22:37+00 | Wiza - Ritchie | -72.27 | 1360.52 |
2512254032 | 2022-04-29 14:00:57+00 | Hessel, Wuckert and Fritsch | 24.16 | 1384.68 |
2512254032 | 2022-04-30 00:00:00+00 | Jacobson, Kuphal and Kessler | 2672.58 | 4057.26 |
5878764270 | 2022-04-01 15:33:03+00 | NULL | 3000.00 | 3000.00 |
5878764270 | 2022-04-02 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 3811.31 |
5878764270 | 2022-04-05 06:16:08+00 | Pacocha - Bernhard | -111.38 | 3699.93 |
5878764270 | 2022-04-05 16:31:33+00 | Rath Group | -99.34 | 3600.59 |
5878764270 | 2022-04-07 05:14:46+00 | Bosco, Kiehn and Leannon | 239.33 | 3839.92 |
5878764270 | 2022-04-07 10:17:57+00 | Weimann, Hauck and Ritchie | -222.87 | 3617.05 |
5878764270 | 2022-04-11 17:06:34+00 | Steuber - Hermann | -189.83 | 3427.22 |
5878764270 | 2022-04-13 20:38:46+00 | Kunde Inc | -22.90 | 3404.32 |
5878764270 | 2022-04-16 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 4215.63 |
5878764270 | 2022-04-16 01:21:12+00 | Oberbrunner, Gerlach and Sawayn | 95.78 | 4311.41 |
5878764270 | 2022-04-16 02:13:24+00 | Kuphal Group | -234.24 | 4077.17 |
5878764270 | 2022-04-16 07:54:19+00 | Schmitt - Howell | -205.04 | 3872.13 |
5878764270 | 2022-04-25 21:33:52+00 | ATM withdrawal | -220.00 | 3652.13 |
5878764270 | 2022-04-26 14:40:52+00 | ATM withdrawal | -280.00 | 3372.13 |
5878764270 | 2022-04-30 00:00:00+00 | Paucek, Tremblay and Conn | 811.31 | 4183.44 |
5878764270 | 2022-04-30 14:37:51+00 | Runolfsdottir LLC | -211.43 | 3972.01 |
7052379077 | 2022-04-01 13:10:00+00 | NULL | 3000.00 | 3000.00 |
7052379077 | 2022-04-02 00:00:00+00 | Franey LLC | 1027.23 | 4027.23 |
7052379077 | 2022-04-07 23:04:01+00 | Gulgowski Inc | 5.59 | 4032.82 |
7052379077 | 2022-04-09 20:25:43+00 | Hilpert, Greenfelder and Kiehn | 204.77 | 4237.59 |
7052379077 | 2022-04-10 07:38:47+00 | Greenfelder - Haag | -239.19 | 3998.40 |
7052379077 | 2022-04-13 14:20:47+00 | ATM withdrawal | -220.00 | 3778.40 |
7052379077 | 2022-04-14 11:46:34+00 | ATM withdrawal | -180.00 | 3598.40 |
7052379077 | 2022-04-16 00:00:00+00 | Franey LLC | 1027.23 | 4625.63 |
7052379077 | 2022-04-20 12:33:23+00 | Batz - Cole | 193.12 | 4818.75 |
7052379077 | 2022-04-21 00:36:10+00 | Hyatt - Ratke | -230.64 | 4588.11 |
7052379077 | 2022-04-26 00:55:05+00 | Okuneva - Lebsack | -29.85 | 4558.26 |
7052379077 | 2022-04-30 00:00:00+00 | Franey LLC | 1027.23 | 5585.49 |
This produces the same result because ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
is the default frame when an aggregate function has a window with ORDER BY
.
It is just assumed.
If ORDER BY
is used in the window of an aggregate function, you only need to specify a frame if it is different than the default.
Frame Modes
The examples we saw here define the frame in "row mode" where the frame definition begins with ROWS
.
There are 2 other modes possible: GROUPS
and RANGE
.
These are alternative ways to specify a frame.
To briefly define the modes here…
ROWS
- Countsn
rows preceding and following.GROUPS
- LikeROWS
except that it countsn
groups. A "group" is a set of rows that have the same value for what is specified in theORDER BY
, i.e. rows with tied values are counted together as 1.RANGE
- Sees all rows where the value for what is specific inORDER BY
is between the range specified by the frame start and frame end. For example, if the window orders by a date, the frame could be between 2INTERVAL
valuesRANGE BETWEEN '1 day' PRECEDING AND '10 days' FOLLOWING
. The values get automaticallyCAST
ed here.