Creating pivot tables

Cleaning Data in PostgreSQL Databases

Darryl Reeves, Ph.D.

Assistant Professor, Long Island University - Brooklyn

Multiple category records

              name               |            inspection_type            | grade | ... 
 --------------------------------+---------------------------------------+-------+-----
 ...                             | ...                                   | ...   | ...
 EMPANADAS MONUMENTAL            | Cycle Inspection / Re-inspection      | B     | ...
 ALPHONSO'S PIZZERIA & TRATTORIA | Cycle Inspection / Initial Inspection | A     | ...
 THE SPARROW TAVERN              | Cycle Inspection / Initial Inspection | A     | ...
 BURGER KING                     | Cycle Inspection / Re-inspection      | A     | ...
 ASTORIA PIZZA                   | Cycle Inspection / Re-inspection      | B     | ...
 ...                             | ...                                   | ...   | ...
Cleaning Data in PostgreSQL Databases

Accessing inspection grades by type

SELECT
    inspection_type,
    grade,
    COUNT(*)
FROM
    restaurant_inspection 
WHERE
    grade IS NOT NULL
GROUP BY
    inspection_type,
    grade
ORDER BY 
    inspection_type,
    grade;
Cleaning Data in PostgreSQL Databases

Aggregated inspection results by type

                  inspection_type                  | grade | count 
 --------------------------------------------------+-------+-------
 Cycle Inspection / Initial Inspection             | A     |  1063
 Cycle Inspection / Re-inspection                  | A     |   723
 Cycle Inspection / Re-inspection                  | B     |   270
 Cycle Inspection / Re-inspection                  | C     |    93
 Cycle Inspection / Re-inspection                  | Z     |    29
 Cycle Inspection / Reopening Inspection           | C     |     8
 Cycle Inspection / Reopening Inspection           | P     |    26
 Cycle Inspection / Reopening Inspection           | Z     |     3
 Pre-permit (Non-operational) / Initial Inspection | N     |     4
 Pre-permit (Operational) / Initial Inspection     | A     |   119
 Pre-permit (Operational) / Initial Inspection     | N     |    17
 Pre-permit (Operational) / Re-inspection          | A     |    79
 Pre-permit (Operational) / Re-inspection          | B     |    49
 Pre-permit (Operational) / Re-inspection          | C     |    13
 Pre-permit (Operational) / Re-inspection          | Z     |     9
 Pre-permit (Operational) / Reopening Inspection   | C     |     3
 Pre-permit (Operational) / Reopening Inspection   | P     |     3
 Pre-permit (Operational) / Reopening Inspection   | Z     |     1
Cleaning Data in PostgreSQL Databases

Changing (pivoting) data orientation

                  inspection_type                  |  A   |  B  | C  | N  | P  | Z  
 --------------------------------------------------+------+-----+----+----+----+----
 Cycle Inspection / Re-inspection                  |  723 | 270 | 93 |  0 |  0 | 29
 Cycle Inspection / Initial Inspection             | 1063 |   0 |  0 |  0 |  0 |  0
 Pre-permit (Operational) / Reopening Inspection   |    0 |   0 |  3 |  0 |  3 |  1
 Cycle Inspection / Reopening Inspection           |    0 |   0 |  8 |  0 | 26 |  3
 Pre-permit (Non-operational) / Initial Inspection |    0 |   0 |  0 |  4 |  0 |  0
 Pre-permit (Operational) / Initial Inspection     |  119 |   0 |  0 | 17 |  0 |  0
 Pre-permit (Operational) / Re-inspection          |   79 |  49 | 13 |  0 |  0 |  9
Cleaning Data in PostgreSQL Databases

The FILTER clause

  • Applies an aggregation over a subset of records
  • Subset of records determined by accompanying WHERE clause
  • Used in the SELECT list of a query
Cleaning Data in PostgreSQL Databases

The FILTER clause

  • Example: AVG(qty_sold) FILTER (WHERE qty_sold > 1)
  • Format: AGG_FUNC(expression) FILTER (WHERE condition)
    • AGG_FUNC() - aggregate function
Cleaning Data in PostgreSQL Databases

The pivot table query

SELECT
    summary_column,
    AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_1) AS "pivot_column_1",
    AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_2) AS "pivot_column_2",
    ...
    AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_N) AS "pivot_column_N"
FROM
    source_table
GROUP BY
    summary_column;
Cleaning Data in PostgreSQL Databases

The pivot table output

A diagram showing the layout after pivoting with columns summary_column, pivot_column_1, pivot_column, ..., pivot_column_N. summary_val_1 to summary_val_M populate the summary_column. The text agg result for PV1, PV2, to PVN populate the respective pivot_column columns.

Cleaning Data in PostgreSQL Databases

Pivoting restaurant inspection data

SELECT
    inspection_type,
    COUNT(grade) FILTER (WHERE grade = 'A') AS "A",
    COUNT(grade) FILTER (WHERE grade = 'B') AS "B",
    COUNT(grade) FILTER (WHERE grade = 'C') AS "C",
    COUNT(grade) FILTER (WHERE grade = 'N') AS "N",
    COUNT(grade) FILTER (WHERE grade = 'P') AS "P",
    COUNT(grade) FILTER (WHERE grade = 'Z') AS "Z"
FROM
    restaurant_inspections 
WHERE
    grade IS NOT NULL
GROUP BY
    inspection_type;
Cleaning Data in PostgreSQL Databases

Pivot table output for inspection data

                  inspection_type                  |  A   |  B  | C  | N  | P  | Z  
 --------------------------------------------------+------+-----+----+----+----+----
 Cycle Inspection / Re-inspection                  |  723 | 270 | 93 |  0 |  0 | 29
 Cycle Inspection / Initial Inspection             | 1063 |   0 |  0 |  0 |  0 |  0
 Pre-permit (Operational) / Reopening Inspection   |    0 |   0 |  3 |  0 |  3 |  1
 Cycle Inspection / Reopening Inspection           |    0 |   0 |  8 |  0 | 26 |  3
 Pre-permit (Non-operational) / Initial Inspection |    0 |   0 |  0 |  4 |  0 |  0
 Pre-permit (Operational) / Initial Inspection     |  119 |   0 |  0 | 17 |  0 |  0
 Pre-permit (Operational) / Re-inspection          |   79 |  49 | 13 |  0 |  0 |  9
Cleaning Data in PostgreSQL Databases

Let's practice!

Cleaning Data in PostgreSQL Databases

Preparing Video For Download...