Cleaning Data in PostgreSQL Databases
Darryl Reeves, Ph.D.
Assistant Professor, Long Island University - Brooklyn
name | inspection_type | grade | ...
--------------------------------+---------------------------------------+-------+-----
... | ... | ... | ...
EMPANADAS MONUMENTAL | Cycle Inspection / Re-inspection | B | ...
ALPHONSO'S PIZZERIA & TRATTORIA | Cycle Inspection / Initial Inspection | A | ...
THE SPARROW TAVERN | Cycle Inspection / Initial Inspection | A | ...
BURGER KING | Cycle Inspection / Re-inspection | A | ...
ASTORIA PIZZA | Cycle Inspection / Re-inspection | B | ...
... | ... | ... | ...
SELECT
inspection_type,
grade,
COUNT(*)
FROM
restaurant_inspection
WHERE
grade IS NOT NULL
GROUP BY
inspection_type,
grade
ORDER BY
inspection_type,
grade;
inspection_type | grade | count
--------------------------------------------------+-------+-------
Cycle Inspection / Initial Inspection | A | 1063
Cycle Inspection / Re-inspection | A | 723
Cycle Inspection / Re-inspection | B | 270
Cycle Inspection / Re-inspection | C | 93
Cycle Inspection / Re-inspection | Z | 29
Cycle Inspection / Reopening Inspection | C | 8
Cycle Inspection / Reopening Inspection | P | 26
Cycle Inspection / Reopening Inspection | Z | 3
Pre-permit (Non-operational) / Initial Inspection | N | 4
Pre-permit (Operational) / Initial Inspection | A | 119
Pre-permit (Operational) / Initial Inspection | N | 17
Pre-permit (Operational) / Re-inspection | A | 79
Pre-permit (Operational) / Re-inspection | B | 49
Pre-permit (Operational) / Re-inspection | C | 13
Pre-permit (Operational) / Re-inspection | Z | 9
Pre-permit (Operational) / Reopening Inspection | C | 3
Pre-permit (Operational) / Reopening Inspection | P | 3
Pre-permit (Operational) / Reopening Inspection | Z | 1
inspection_type | A | B | C | N | P | Z
--------------------------------------------------+------+-----+----+----+----+----
Cycle Inspection / Re-inspection | 723 | 270 | 93 | 0 | 0 | 29
Cycle Inspection / Initial Inspection | 1063 | 0 | 0 | 0 | 0 | 0
Pre-permit (Operational) / Reopening Inspection | 0 | 0 | 3 | 0 | 3 | 1
Cycle Inspection / Reopening Inspection | 0 | 0 | 8 | 0 | 26 | 3
Pre-permit (Non-operational) / Initial Inspection | 0 | 0 | 0 | 4 | 0 | 0
Pre-permit (Operational) / Initial Inspection | 119 | 0 | 0 | 17 | 0 | 0
Pre-permit (Operational) / Re-inspection | 79 | 49 | 13 | 0 | 0 | 9
WHERE
clauseSELECT
list of a queryAVG(qty_sold) FILTER (WHERE qty_sold > 1)
AGG_FUNC(expression) FILTER (WHERE condition)
AGG_FUNC()
- aggregate functionSELECT
summary_column,
AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_1) AS "pivot_column_1",
AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_2) AS "pivot_column_2",
...
AGG(agg_column) FILTER (WHERE agg_column = PIVOT_VALUE_N) AS "pivot_column_N"
FROM
source_table
GROUP BY
summary_column;
SELECT
inspection_type,
COUNT(grade) FILTER (WHERE grade = 'A') AS "A",
COUNT(grade) FILTER (WHERE grade = 'B') AS "B",
COUNT(grade) FILTER (WHERE grade = 'C') AS "C",
COUNT(grade) FILTER (WHERE grade = 'N') AS "N",
COUNT(grade) FILTER (WHERE grade = 'P') AS "P",
COUNT(grade) FILTER (WHERE grade = 'Z') AS "Z"
FROM
restaurant_inspections
WHERE
grade IS NOT NULL
GROUP BY
inspection_type;
inspection_type | A | B | C | N | P | Z
--------------------------------------------------+------+-----+----+----+----+----
Cycle Inspection / Re-inspection | 723 | 270 | 93 | 0 | 0 | 29
Cycle Inspection / Initial Inspection | 1063 | 0 | 0 | 0 | 0 | 0
Pre-permit (Operational) / Reopening Inspection | 0 | 0 | 3 | 0 | 3 | 1
Cycle Inspection / Reopening Inspection | 0 | 0 | 8 | 0 | 26 | 3
Pre-permit (Non-operational) / Initial Inspection | 0 | 0 | 0 | 4 | 0 | 0
Pre-permit (Operational) / Initial Inspection | 119 | 0 | 0 | 17 | 0 | 0
Pre-permit (Operational) / Re-inspection | 79 | 49 | 13 | 0 | 0 | 9
Cleaning Data in PostgreSQL Databases