PostgreSQL Summary Stats and Window Functions
Michel Semaan
Data Scientist
LAG(column, n)
returns column
's value at the row n
rows before the current rowLEAD(column, n)
returns column
's value at the row n
rows after the current rowFIRST_VALUE(column)
returns the first value in the table or partitionLAST_VALUE(column)
returns the last value in the table or partitionQuery
WITH Hosts AS (
SELECT DISTINCT Year, City
FROM Summer_Medals)
SELECT
Year, City,
LEAD(City, 1) OVER (ORDER BY Year ASC)
AS Next_City,
LEAD(City, 2) OVER (ORDER BY Year ASC)
AS After_Next_City
FROM Hosts
ORDER BY Year ASC;
Result
| Year | City | Next_City | After_Next_City |
|------|-----------|-----------|-----------------|
| 1896 | Athens | Paris | St Louis |
| 1900 | Paris | St Louis | London |
| 1904 | St Louis | London | Stockholm |
| 1908 | London | Stockholm | Antwerp |
| 1912 | Stockholm | Antwerp | Paris |
| ... | ... | ... | ... |
Query
SELECT
Year, City,
FIRST_VALUE(City) OVER
(ORDER BY Year ASC) AS First_City,
LAST_VALUE(City) OVER (
ORDER BY Year ASC
RANGE BETWEEN
UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING
) AS Last_City
FROM Hosts
ORDER BY Year ASC;
Result
| Year | City | First_City | Last_City |
|------|-----------|------------|-----------------|
| 1896 | Athens | Athens | London |
| 1900 | Paris | Athens | London |
| 1904 | St Louis | Athens | London |
| 1908 | London | Athens | London |
| 1912 | Stockholm | Athens | London |
RANGE BETWEEN ...
clause extends the window to the end of the table or partitionLEAD(Champion, 1)
without PARTITION BY
| Year | Event | Champion | Next_Champion |
|------|--------------|----------|---------------|
| 2004 | Discus Throw | LTU | EST |
| 2008 | Discus Throw | EST | GER |
| 2012 | Discus Throw | GER | SWE |
| 2004 | Triple Jump | SWE | POR |
| 2008 | Triple Jump | POR | USA |
| 2012 | Triple Jump | USA | null |
LEAD(Champion, 1)
with PARTITION BY Event
| Year | Event | Champion | Next_Champion |
|------|--------------|----------|---------------|
| 2004 | Discus Throw | LTU | EST |
| 2008 | Discus Throw | EST | GER |
| 2012 | Discus Throw | GER | null |
| 2004 | Triple Jump | SWE | POR |
| 2008 | Triple Jump | POR | USA |
| 2012 | Triple Jump | USA | null |
FIRST_VALUE(Champion)
without PARTITION BY Event
| Year | Event | Champion | First_Champion |
|------|--------------|----------|----------------|
| 2004 | Discus Throw | LTU | LTU |
| 2008 | Discus Throw | EST | LTU |
| 2012 | Discus Throw | GER | LTU |
| 2004 | Triple Jump | SWE | LTU |
| 2008 | Triple Jump | POR | LTU |
| 2012 | Triple Jump | USA | LTU |
FIRST_VALUE(Champion)
with PARTITION BY Event
| Year | Event | Champion | First_Champion |
|------|--------------|----------|----------------|
| 2004 | Discus Throw | LTU | LTU |
| 2008 | Discus Throw | EST | LTU |
| 2012 | Discus Throw | GER | LTU |
| 2004 | Triple Jump | SWE | SWE |
| 2008 | Triple Jump | POR | SWE |
| 2012 | Triple Jump | USA | SWE |
PostgreSQL Summary Stats and Window Functions