PostgreSQL Summary Stats and Window Functions
Michel Semaan
Data Scientist
ROW_NUMBER()
always assigns unique numbers, even if two rows' values are the sameRANK()
assigns the same number to rows with identical values, skipping over the next numbers in such casesDENSE_RANK()
also assigns the same number to rows with identical values, but doesn't skip over the next numbersQuery
SELECT
Country, COUNT(DISTINCT Year) AS Games
FROM Summer_Medals
WHERE
Country IN ('GBR', 'DEN', 'FRA',
'ITA', 'AUT', 'BEL',
'NOR', 'POL', 'ESP')
GROUP BY Country
ORDER BY Games DESC;
Result
| Country | Games |
|---------|-------|
| GBR | 27 |
| DEN | 26 |
| FRA | 26 |
| ITA | 25 |
| AUT | 24 |
| BEL | 24 |
| NOR | 22 |
| POL | 20 |
| ESP | 18 |
Query
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Result
| Country | Games | Row_N |
|---------|-------|-------|
| GBR | 27 | 1 |
| DEN | 26 | 2 |
| FRA | 26 | 3 |
| ITA | 25 | 4 |
| AUT | 24 | 5 |
| BEL | 24 | 6 |
| NOR | 22 | 7 |
| POL | 20 | 8 |
| ESP | 18 | 9 |
Query
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N,
RANK()
OVER (ORDER BY Games DESC) AS Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Result
| Country | Games | Row_N | Rank_N |
|---------|-------|-------|--------|
| GBR | 27 | 1 | 1 |
| DEN | 26 | 2 | 2 |
| FRA | 26 | 3 | 2 |
| ITA | 25 | 4 | 4 |
| AUT | 24 | 5 | 5 |
| BEL | 24 | 6 | 5 |
| NOR | 22 | 7 | 7 |
| POL | 20 | 8 | 8 |
| ESP | 18 | 9 | 9 |
Query
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N,
RANK()
OVER (ORDER BY Games DESC) AS Rank_N,
DENSE_RANK()
OVER (ORDER BY Games DESC) AS Dense_Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
ROW_NUMBER
and RANK
will have the same last rank, the count of rowsResult
| Country | Games | Row_N | Rank_N | Dense_Rank_N |
|---------|-------|-------|--------|--------------|
| GBR | 27 | 1 | 1 | 1 |
| DEN | 26 | 2 | 2 | 2 |
| FRA | 26 | 3 | 2 | 2 |
| ITA | 25 | 4 | 4 | 3 |
| AUT | 24 | 5 | 5 | 4 |
| BEL | 24 | 6 | 5 | 5 |
| NOR | 22 | 7 | 7 | 5 |
| POL | 20 | 8 | 8 | 6 |
| ESP | 18 | 9 | 9 | 7 |
DENSE_RANK
's last rank is the count of unique values being rankedQuery
SELECT
Country, Athlete, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
Country IN ('CHN', 'RUS')
AND Year = 2012
GROUP BY Country, Athlete
HAVING COUNT(*) > 1
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete | Medals |
|---------|-------------------|--------|
| CHN | SUN Yang | 4 |
| CHN | Guo Shuang | 3 |
| CHN | WANG Hao | 3 |
| ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 |
| RUS | ANTYUKH Natalya | 2 |
| RUS | ISHCHENKO Natalia | 2 |
| ... | ... | ... |
Query
WITH Country_Medals AS (...)
SELECT
Country, Athlete, Medals,
DENSE_RANK()
OVER (ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN | SUN Yang | 4 | 1 |
| CHN | Guo Shuang | 3 | 2 |
| CHN | WANG Hao | 3 | 2 |
| ... | ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 | 1 |
| RUS | ANTYUKH Natalya | 2 | 3 |
| RUS | ISHCHENKO Natalia | 2 | 3 |
| ... | ... | ... | ... |
Query
WITH Country_Medals AS (...)
SELECT
Country, Athlete,
DENSE_RANK()
OVER (PARTITION BY Country
ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN | SUN Yang | 4 | 1 |
| CHN | Guo Shuang | 3 | 2 |
| CHN | WANG Hao | 3 | 2 |
| ... | ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 | 1 |
| RUS | ANTYUKH Natalya | 2 | 2 |
| RUS | ISHCHENKO Natalia | 2 | 2 |
| ... | ... | ... | ... |
PostgreSQL Summary Stats and Window Functions