PostgreSQL: Zusammenfassende Statistiken und Fensterfunktionen
Michel Semaan
Data Scientist
ROW_NUMBER() weist immer eindeutige Nummern zu, auch bei identischen Werten in zwei ZeilenRANK() weist Zeilen mit identischen Werten gleiche Nummer zu und überspringt die nächsten NummernDENSE_RANK() weist auch Zeilen mit identischen Werten die gleiche Nummer zu, überspringt aber nicht die nächsten NummernAbfrage
SELECT
Country, COUNT(DISTINCT Year) AS Games
FROM Summer_Medals
WHERE
Country IN ('GBR', 'DEN', 'FRA',
'ITA', 'AUT', 'BEL',
'NOR', 'POL', 'ESP')
GROUP BY Country
ORDER BY Games DESC;
Ergebnis
| Country | Games |
|---------|-------|
| GBR | 27 |
| DEN | 26 |
| FRA | 26 |
| ITA | 25 |
| AUT | 24 |
| BEL | 24 |
| NOR | 22 |
| POL | 20 |
| ESP | 18 |
Abfrage
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Ergebnis
| Country | Games | Row_N |
|---------|-------|-------|
| GBR | 27 | 1 |
| DEN | 26 | 2 |
| FRA | 26 | 3 |
| ITA | 25 | 4 |
| AUT | 24 | 5 |
| BEL | 24 | 6 |
| NOR | 22 | 7 |
| POL | 20 | 8 |
| ESP | 18 | 9 |
Abfrage
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N,
RANK()
OVER (ORDER BY Games DESC) AS Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
Ergebnis
| Country | Games | Row_N | Rank_N |
|---------|-------|-------|--------|
| GBR | 27 | 1 | 1 |
| DEN | 26 | 2 | 2 |
| FRA | 26 | 3 | 2 |
| ITA | 25 | 4 | 4 |
| AUT | 24 | 5 | 5 |
| BEL | 24 | 6 | 5 |
| NOR | 22 | 7 | 7 |
| POL | 20 | 8 | 8 |
| ESP | 18 | 9 | 9 |
Abfrage
WITH Country_Games AS (...)
SELECT
Country, Games,
ROW_NUMBER()
OVER (ORDER BY Games DESC) AS Row_N,
RANK()
OVER (ORDER BY Games DESC) AS Rank_N,
DENSE_RANK()
OVER (ORDER BY Games DESC) AS Dense_Rank_N
FROM Country_Games
ORDER BY Games DESC, Country ASC;
ROW_NUMBER und RANK haben denselben letzten Rang, nämlich Anzahl der ZeilenErgebnis
| Country | Games | Row_N | Rank_N | Dense_Rank_N |
|---------|-------|-------|--------|--------------|
| GBR | 27 | 1 | 1 | 1 |
| DEN | 26 | 2 | 2 | 2 |
| FRA | 26 | 3 | 2 | 2 |
| ITA | 25 | 4 | 4 | 3 |
| AUT | 24 | 5 | 5 | 4 |
| BEL | 24 | 6 | 5 | 5 |
| NOR | 22 | 7 | 7 | 5 |
| POL | 20 | 8 | 8 | 6 |
| ESP | 18 | 9 | 9 | 7 |
DENSE_RANK hat als letzter Rang die Anzahl der einzigartigen Werte der RanglisteAbfrage
SELECT
Country, Athlete, COUNT(*) AS Medals
FROM Summer_Medals
WHERE
Country IN ('CHN', 'RUS')
AND Year = 2012
GROUP BY Country, Athlete
HAVING COUNT(*) > 1
ORDER BY Country ASC, Medals DESC;
Ergebnis
| Country | Athlete | Medals |
|---------|-------------------|--------|
| CHN | SUN Yang | 4 |
| CHN | Guo Shuang | 3 |
| CHN | WANG Hao | 3 |
| ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 |
| RUS | ANTYUKH Natalya | 2 |
| RUS | ISHCHENKO Natalia | 2 |
| ... | ... | ... |
Abfrage
WITH Country_Medals AS (...)
SELECT
Country, Athlete, Medals,
DENSE_RANK()
OVER (ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Ergebnis
| Country | Athlete | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN | SUN Yang | 4 | 1 |
| CHN | Guo Shuang | 3 | 2 |
| CHN | WANG Hao | 3 | 2 |
| ... | ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 | 1 |
| RUS | ANTYUKH Natalya | 2 | 3 |
| RUS | ISHCHENKO Natalia | 2 | 3 |
| ... | ... | ... | ... |
Abfrage
WITH Country_Medals AS (...)
SELECT
Country, Athlete,
DENSE_RANK()
OVER (PARTITION BY Country
ORDER BY Medals DESC) AS Rank_N
FROM Country_Medals
ORDER BY Country ASC, Medals DESC;
Ergebnis
| Country | Athlete | Medals | Rank_N |
|---------|-------------------|--------|--------|
| CHN | SUN Yang | 4 | 1 |
| CHN | Guo Shuang | 3 | 2 |
| CHN | WANG Hao | 3 | 2 |
| ... | ... | ... | ... |
| RUS | MUSTAFINA Aliya | 4 | 1 |
| RUS | ANTYUKH Natalya | 2 | 2 |
| RUS | ISHCHENKO Natalia | 2 | 2 |
| ... | ... | ... | ... |
PostgreSQL: Zusammenfassende Statistiken und Fensterfunktionen