Paging

PostgreSQL Summary Stats and Window Functions

Michel Semaan

Data Scientist

What is paging?

  • Paging: splitting data into (approximately) equal chunks
  • Uses
    • Many APIs return data in "pages" to reduce the amount of data sent per request
    • Splitting data into quartiles or thirds (top, middle, and bottom 33%) to judge performance

Enter NTILE

  • NTILE(n) splits the data into n approximately equal pages
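
How NTILE(n) distributes rows can be sketched in Python. This is only an illustration, assuming PostgreSQL's behavior of dividing the rows as equally as possible and giving any leftover rows to the lowest-numbered pages. The function name `ntile` and the sample input are hypothetical, not part of the course material.

```python
def ntile(row_count, n):
    """Return the 1-based page number for each of row_count ordered rows."""
    base, extra = divmod(row_count, n)
    pages = []
    for page in range(1, n + 1):
        # The first `extra` pages receive one additional row.
        size = base + 1 if page <= extra else base
        pages.extend([page] * size)
    return pages


# Ten rows split into three pages: one page of four rows, two of three.
pages = ntile(10, 3)
print(pages)  # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
```

When the row count is not divisible by n, page sizes differ by at most one row.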

Paging - source table

Query

SELECT
  DISTINCT Discipline
FROM Summer_Medals;

  • NTILE(15) will split these 67 disciplines into 15 roughly equal pages
  • $67 / 15 \approx 4.47$, so each page holds four or five rows ($67 = 7 \cdot 5 + 8 \cdot 4$)
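
The arithmetic behind the four-or-five-row pages can be checked with a few lines of Python. This is a sketch of the division only, assuming the larger pages come first as in PostgreSQL's NTILE. The variable names are illustrative.

```python
rows, pages = 67, 15  # figures from the slide

# Integer division gives the base page size; the remainder is the number
# of pages that need one extra row.
base, extra = divmod(rows, pages)  # base = 4, extra = 7

page_sizes = [base + 1] * extra + [base] * (pages - extra)

print(page_sizes)       # [5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4]
print(sum(page_sizes))  # 67
```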

Result

| Discipline          |
|---------------------|
| Wrestling Freestyle |
| Archery             |
| Baseball            |
| Lacrosse            |
| Judo                |
| Athletics           |
| ...                 |

(67 rows)

Paging

Query

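-- The 67 distinct disciplines from the source table
WITH Disciplines AS (
  SELECT
    DISTINCT Discipline
  FROM Summer_Medals)

-- OVER () has no ORDER BY, so which disciplines land on which page is not guaranteed
SELECT
  Discipline, NTILE(15) OVER () AS Page
FROM Disciplines
ORDER BY Page ASC;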

Result

| Discipline          | Page |
|---------------------|------|
| Wrestling Freestyle | 1    |
| Archery             | 1    |
| Baseball            | 1    |
| Lacrosse            | 1    |
| Judo                | 1    |
| Athletics           | 2    |
| ...                 | ...  |

Top, middle, and bottom thirds

Query

WITH Country_Medals AS (
  SELECT
    Country, COUNT(*) AS Medals
  FROM Summer_Medals
  GROUP BY Country)

SELECT
  Country, Medals,
  NTILE(3) OVER (ORDER BY Medals DESC) AS Third
FROM Country_Medals;

Result

| Country | Medals | Third |
|---------|--------|-------|
| USA     | 4585   | 1     |
| URS     | 2049   | 1     |
| GBR     | 1720   | 1     |
| ...     | ...    | ...   |
| CZE     | 56     | 2     |
| LTU     | 55     | 2     |
| ...     | ...    | ...   |
| DOM     | 6      | 3     |
| BWI     | 5      | 3     |
| ...     | ...    | ...   |

Averages per third

Query

WITH Country_Medals AS (...),

  Thirds AS (
  SELECT
    Country, Medals,
    NTILE(3) OVER (ORDER BY Medals DESC) AS Third
  FROM Country_Medals)

SELECT
  Third,
  ROUND(AVG(Medals), 2) AS Avg_Medals
FROM Thirds
GROUP BY Third
ORDER BY Third ASC;

Result

| Third | Avg_Medals |
|-------|------------|
| 1     | 598.74     |
| 2     | 22.98      |
| 3     | 2.08       |
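
The two steps of the query, assigning thirds and then averaging per third, can be mirrored in Python. This sketch uses only the seven rows visible on the thirds slide, so its averages do not match the Avg_Medals column above, which is computed over the full table. The helper names `assign_thirds` and `average_per_third` are hypothetical.

```python
from collections import defaultdict

# The seven (Country, Medals) rows shown on the slide; the real table has more.
country_medals = [
    ("USA", 4585), ("URS", 2049), ("GBR", 1720),
    ("CZE", 56), ("LTU", 55), ("DOM", 6), ("BWI", 5),
]


def assign_thirds(rows, n=3):
    """Roughly mimic NTILE(n) OVER (ORDER BY Medals DESC)."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    base, extra = divmod(len(ordered), n)
    result, start = [], 0
    for third in range(1, n + 1):
        # The first `extra` groups receive one additional row.
        size = base + 1 if third <= extra else base
        for country, medals in ordered[start:start + size]:
            result.append((country, medals, third))
        start += size
    return result


def average_per_third(thirds):
    """Roughly mimic GROUP BY Third with ROUND(AVG(Medals), 2)."""
    groups = defaultdict(list)
    for _country, medals, third in thirds:
        groups[third].append(medals)
    return {
        third: round(sum(values) / len(values), 2)
        for third, values in sorted(groups.items())
    }


thirds = assign_thirds(country_medals)
averages = average_per_third(thirds)
print(averages)  # {1: 2784.67, 2: 55.5, 3: 5.5}
```

Even on this small subset the pattern from the slide holds: the top third's average is far above the other two.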

Let's practice!
