Row-oriented storage and partitions

Improving Query Performance in PostgreSQL

Amy McCarty

Instructor

Database storage types

Row oriented storage

  • Relation between columns retained

Column-oriented storage

  • Relation between rows retained
id name species age habitat received
01 Bob panda 2 Asia 2018
02 Sunny zebra 3 Africa 2018
03 Beco zebra 10 Africa 2017
04 Coco koala 5 Australia 2016
Improving Query Performance in PostgreSQL

Row-oriented

Row-oriented storage

  • Relation between columns retained

 

id name species age habitat received
01 Bob panda 2 Asia 2018
Improving Query Performance in PostgreSQL

Column-oriented

Column-oriented storage

  • Relation between rows retained  

Two tables. The first table is a list of ids and names. 01 - Bob. 02 - Sunny. 03 - Beco. 04 - Coco. The second table is the same ids with a list of species. 01 - panda. 02 - zebra. 03 - zebra. 04 - koala.

Improving Query Performance in PostgreSQL

Row-oriented storage

 

  • One row stored in same location
  • Fast to append or delete whole records
  • Quick to return all columns
    • Slow to return all rows
Improving Query Performance in PostgreSQL

Reducing the rows

Reduce the number of rows

  • WHERE filter
  • INNER JOIN
  • DISTINCT
  • LIMIT
Improving Query Performance in PostgreSQL

Row-oriented database methods

Partitions

  • Method of splitting one (parent) table into many, smaller (children) tables

Indexes

  • Method of creating sorted column keys to improve search

 

Using partitions and indexes

  • Require set up and maintenance
  • Existence known from database administrator or documentation
Improving Query Performance in PostgreSQL

Partition structure

A parent table of zoo animals with four records showing id, name, species, habitat, and year received. 01 - Bob - Panda - age - habitat - received. 02 - Sunny - Zebra - 3 - Africa - 2018. 03 - Beco - Zebra - 10 - Africa - 2017. 04 - Coco - Macaw - 5 - South America - 2016. There are 3 children tables corresponding to the habitat columns. The Asia table has one record for Bob. The Africa table has two records for Sunny and Beco. The South America table has one record cor Coco.

  • Parent table
    • Visible in database front end
    • Write queries
  • Children tables
    • Not visible in database front end
    • Queries search
Improving Query Performance in PostgreSQL

Partition structure

A parent table of zoo animals with four records showing id, name, species, habitat, and year received. 01 - Bob - Panda - age - habitat - received. 02 - Sunny - Zebra - 3 - Africa - 2018. 03 - Beco - Zebra - 10 - Africa - 2017. 04 - Coco - Macaw - 5 - South America - 2016. There are 3 children tables corresponding to the habitat columns. The Asia table has one record for Bob. The Africa table has two records for Sunny and Beco. The South America table has one record cor Coco. The Africa table is outlined in red.

SELECT species
FROM zoo_animals
WHERE habitat = 'Africa'
Improving Query Performance in PostgreSQL

Partition overview

What

  • Splitting of one table into many smaller tables

Why

  • Storage flexibility
  • Faster queries

Where

  • Common filter columns
    • Date, location
Improving Query Performance in PostgreSQL

Partition query assessment

Query planner

Multiple chefs around a big cooking pot

EXPLAIN
SELECT species
FROM zoo_animals
WHERE habitat = 'Africa'

Query Plan

Seq Scan on zoo_animals  (cost=0.00..
17.70 rows=2 width=182)
  Filter: (state_code = 15)
  • Cost (time) estimates
Improving Query Performance in PostgreSQL

Let's practice!

Improving Query Performance in PostgreSQL

Preparing Video For Download...