Ranking

PostgreSQL Summary Stats and Window Functions

Michel Semaan

Data Scientist

The ranking functions

  • ROW_NUMBER() always assigns unique numbers, even if two rows' values are the same
  • RANK() assigns the same number to rows with identical values, skipping over the next numbers in such cases
  • DENSE_RANK() also assigns the same number to rows with identical values, but doesn't skip over the next numbers
PostgreSQL Summary Stats and Window Functions

Source table

Query

SELECT
  Country, COUNT(DISTINCT Year) AS Games
FROM Summer_Medals
WHERE
  Country IN ('GBR', 'DEN', 'FRA',
              'ITA', 'AUT', 'BEL',
              'NOR', 'POL', 'ESP')
GROUP BY Country
ORDER BY Games DESC;

Result

| Country | Games |
|---------|-------|
| GBR     | 27    |
| DEN     | 26    |
| FRA     | 26    |
| ITA     | 25    |
| AUT     | 24    |
| BEL     | 24    |
| NOR     | 22    |
| POL     | 20    |
| ESP     | 18    |
PostgreSQL Summary Stats and Window Functions

Different ranking functions - ROW_NUMBER

Query

WITH Country_Games AS (...)

SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;

Result

| Country | Games | Row_N |
|---------|-------|-------|
| GBR     | 27    | 1     |
| DEN     | 26    | 2     |
| FRA     | 26    | 3     |
| ITA     | 25    | 4     |
| AUT     | 24    | 5     |
| BEL     | 24    | 6     |
| NOR     | 22    | 7     |
| POL     | 20    | 8     |
| ESP     | 18    | 9     |
PostgreSQL Summary Stats and Window Functions

Different ranking functions - RANK

Query

WITH Country_Games AS (...)

SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N,
  RANK()
    OVER (ORDER BY Games DESC) AS Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;

Result

| Country | Games | Row_N | Rank_N |
|---------|-------|-------|--------|
| GBR     | 27    | 1     | 1      |
| DEN     | 26    | 2     | 2      |
| FRA     | 26    | 3     | 2      |
| ITA     | 25    | 4     | 4      |
| AUT     | 24    | 5     | 5      |
| BEL     | 24    | 6     | 5      |
| NOR     | 22    | 7     | 7      |
| POL     | 20    | 8     | 8      |
| ESP     | 18    | 9     | 9      |
PostgreSQL Summary Stats and Window Functions

Different ranking functions - DENSE_RANK

Query

WITH Country_Games AS (...)

SELECT
  Country, Games,
  ROW_NUMBER()
    OVER (ORDER BY Games DESC) AS Row_N,
  RANK()
    OVER (ORDER BY Games DESC) AS Rank_N,
  DENSE_RANK()
    OVER (ORDER BY Games DESC) AS Dense_Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
  • ROW_NUMBER and RANK will have the same last rank, the count of rows

Result

| Country | Games | Row_N | Rank_N | Dense_Rank_N |
|---------|-------|-------|--------|--------------|
| GBR     | 27    | 1     | 1      | 1            |
| DEN     | 26    | 2     | 2      | 2            |
| FRA     | 26    | 3     | 2      | 2            |
| ITA     | 25    | 4     | 4      | 3            |
| AUT     | 24    | 5     | 5      | 4            |
| BEL     | 24    | 6     | 5      | 5            |
| NOR     | 22    | 7     | 7      | 5            |
| POL     | 20    | 8     | 8      | 6            |
| ESP     | 18    | 9     | 9      | 7            |
  • DENSE_RANK's last rank is the count of unique values being ranked
PostgreSQL Summary Stats and Window Functions

Ranking without partitioning - Source table

Query

SELECT
  Country, Athlete, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
  Country IN ('CHN', 'RUS')
  AND Year = 2012
GROUP BY Country, Athlete
HAVING COUNT(*) > 1
ORDER BY Country ASC, Medals DESC;

Result

| Country | Athlete           | Medals |
|---------|-------------------|--------|
| CHN     | SUN Yang          | 4      |
| CHN     | Guo Shuang        | 3      |
| CHN     | WANG Hao          | 3      |
| ...     | ...               | ...    |
| RUS     | MUSTAFINA Aliya   | 4      |
| RUS     | ANTYUKH Natalya   | 2      |
| RUS     | ISHCHENKO Natalia | 2      |
| ...     | ...               | ...    |
PostgreSQL Summary Stats and Window Functions

Ranking without partitioning

Query

WITH Country_Medals AS (...)

SELECT
  Country, Athlete, Medals,
  DENSE_RANK()
    OVER (ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;

Result

| Country | Athlete           | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN     | SUN Yang          | 4      | 1      |
| CHN     | Guo Shuang        | 3      | 2      |
| CHN     | WANG Hao          | 3      | 2      |
| ...     | ...               | ...    | ...    |
| RUS     | MUSTAFINA Aliya   | 4      | 1      |
| RUS     | ANTYUKH Natalya   | 2      | 3      |
| RUS     | ISHCHENKO Natalia | 2      | 3      |
| ...     | ...               | ...    | ...    |
PostgreSQL Summary Stats and Window Functions

Ranking with partitioning

Query

WITH Country_Medals AS (...)

SELECT
  Country, Athlete,
  DENSE_RANK()
    OVER (PARTITION BY Country
              ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;

Result

| Country | Athlete           | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN     | SUN Yang          | 4      | 1      |
| CHN     | Guo Shuang        | 3      | 2      |
| CHN     | WANG Hao          | 3      | 2      |
| ...     | ...               | ...    | ...    |
| RUS     | MUSTAFINA Aliya   | 4      | 1      |
| RUS     | ANTYUKH Natalya   | 2      | 2      |
| RUS     | ISHCHENKO Natalia | 2      | 2      |
| ...     | ...               | ...    | ...    |
PostgreSQL Summary Stats and Window Functions

Let's practice!

PostgreSQL Summary Stats and Window Functions

Preparing Video For Download...