PostgreSQL Summary Stats and Window Functions
Michel Semaan
Data Scientist
ROW_NUMBER() always assigns unique numbers, even if two rows' values are the sameRANK() assigns the same number to rows with identical values, skipping over the next numbers in such casesDENSE_RANK() also assigns the same number to rows with identical values, but doesn't skip over the next numbersQuery
SELECT
  Country, COUNT(DISTINCT Year) AS Games
FROM Summer_Medals
WHERE
  Country IN ('GBR', 'DEN', 'FRA',
              'ITA', 'AUT', 'BEL',
              'NOR', 'POL', 'ESP')
GROUP BY Country
ORDER BY Games DESC;
Result
| Country | Games |
|---------|-------|
| GBR     | 27    |
| DEN     | 26    |
| FRA     | 26    |
| ITA     | 25    |
| AUT     | 24    |
| BEL     | 24    |
| NOR     | 22    |
| POL     | 20    |
| ESP     | 18    |
Query
WITH Country_Games AS (...)
SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Result
| Country | Games | Row_N |
|---------|-------|-------|
| GBR     | 27    | 1     |
| DEN     | 26    | 2     |
| FRA     | 26    | 3     |
| ITA     | 25    | 4     |
| AUT     | 24    | 5     |
| BEL     | 24    | 6     |
| NOR     | 22    | 7     |
| POL     | 20    | 8     |
| ESP     | 18    | 9     |
Query
WITH Country_Games AS (...)
SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N,
  RANK()
    OVER (ORDER BY Games DESC) AS Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Result
| Country | Games | Row_N | Rank_N |
|---------|-------|-------|--------|
| GBR     | 27    | 1     | 1      |
| DEN     | 26    | 2     | 2      |
| FRA     | 26    | 3     | 2      |
| ITA     | 25    | 4     | 4      |
| AUT     | 24    | 5     | 5      |
| BEL     | 24    | 6     | 5      |
| NOR     | 22    | 7     | 7      |
| POL     | 20    | 8     | 8      |
| ESP     | 18    | 9     | 9      |
Query
WITH Country_Games AS (...)
SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N,
  RANK()
    OVER (ORDER BY Games DESC) AS Rank_N,
  DENSE_RANK()
    OVER (ORDER BY Games DESC) AS Dense_Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
ROW_NUMBER and RANK will have the same last rank, the count of rowsResult
| Country | Games | Row_N | Rank_N | Dense_Rank_N |
|---------|-------|-------|--------|--------------|
| GBR     | 27    | 1     | 1      | 1            |
| DEN     | 26    | 2     | 2      | 2            |
| FRA     | 26    | 3     | 2      | 2            |
| ITA     | 25    | 4     | 4      | 3            |
| AUT     | 24    | 5     | 5      | 4            |
| BEL     | 24    | 6     | 5      | 5            |
| NOR     | 22    | 7     | 7      | 5            |
| POL     | 20    | 8     | 8      | 6            |
| ESP     | 18    | 9     | 9      | 7            |
DENSE_RANK's last rank is the count of unique values being rankedQuery
SELECT
  Country, Athlete, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
  Country IN ('CHN', 'RUS')
  AND Year = 2012
GROUP BY Country, Athlete
HAVING COUNT(*) > 1
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete           | Medals |
|---------|-------------------|--------|
| CHN     | SUN Yang          | 4      |
| CHN     | Guo Shuang        | 3      |
| CHN     | WANG Hao          | 3      |
| ...     | ...               | ...    |
| RUS     | MUSTAFINA Aliya   | 4      |
| RUS     | ANTYUKH Natalya   | 2      |
| RUS     | ISHCHENKO Natalia | 2      |
| ...     | ...               | ...    |
Query
WITH Country_Medals AS (...)
SELECT
  Country, Athlete, Medals,
  DENSE_RANK()
    OVER (ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete           | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN     | SUN Yang          | 4      | 1      |
| CHN     | Guo Shuang        | 3      | 2      |
| CHN     | WANG Hao          | 3      | 2      |
| ...     | ...               | ...    | ...    |
| RUS     | MUSTAFINA Aliya   | 4      | 1      |
| RUS     | ANTYUKH Natalya   | 2      | 3      |
| RUS     | ISHCHENKO Natalia | 2      | 3      |
| ...     | ...               | ...    | ...    |
Query
WITH Country_Medals AS (...)
SELECT
  Country, Athlete,
  DENSE_RANK()
    OVER (PARTITION BY Country
              ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Result
| Country | Athlete           | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN     | SUN Yang          | 4      | 1      |
| CHN     | Guo Shuang        | 3      | 2      |
| CHN     | WANG Hao          | 3      | 2      |
| ...     | ...               | ...    | ...    |
| RUS     | MUSTAFINA Aliya   | 4      | 1      |
| RUS     | ANTYUKH Natalya   | 2      | 2      |
| RUS     | ISHCHENKO Natalia | 2      | 2      |
| ...     | ...               | ...    | ...    |
PostgreSQL Summary Stats and Window Functions