Using and creating indexes

Improving Query Performance in PostgreSQL

Amy McCarty

Instructor

Index overview

What

  • Method of creating sorted column keys to improve search
  • Similar to book index
  • Reference to data location

Why

  • Faster queries

Where

  • Common filter columns
  • Primary key
Improving Query Performance in PostgreSQL

Index example

ingredient recipe
tomatoes spaghetti & meatballs
green onions fried rice
eggs fried rice
ground beef spaghetti & meatballs
pasta spaghetti & meatballs
rice fried rice
soy sauce fried rice
SELECT *
FROM cookbook
WHERE recipe = 'fried rice'
Improving Query Performance in PostgreSQL

Index as a key and pointer

Two tables. An index table contains the recipe column along with a column of pointers, which go from _12 to _18 sequentially. The table with index is the same table shown on the left with an additional pointer column. The values are the same in this pointer column as in the pointer column of the Index table. Arrows show how the four values for fried rice in the Index table correspond to the four values for fried rice in the Table with Index.

Improving Query Performance in PostgreSQL

Finding existing indexes

pg_tables
  • Similar to information_schema
    • specific to Postgres
  • Metadata about database
Improving Query Performance in PostgreSQL

Finding existing indexes

pg_tables
  • Similar to information_schema
    • specific to Postgres
  • Metadata about database

 

SELECT * FROM pg_indexes
schemaname tablename indexname tablespace indexdef
food dinner recipe_index null CREATE INDEX recipe_index ...
Improving Query Performance in PostgreSQL

Creating an index

Dinner table with ingredient, recipe, and serving size columns. tomatoes - spaghetti & meatballs - 4. green onions - fried rice - 2. eggs - fried rice - 2. ground beef - spaghetti & meatballs - 4. pasta - spaghetti & meatballs - 4. rice - fried rice - 2. soy sauce - fried rice - 2.

CREATE INDEX recipe_index 
ON cookbook (recipe);
CREATE INDEX CONCURRENTLY recipe_index
ON cookbook (recipe, serving_size);
Improving Query Performance in PostgreSQL

To use or not to use

Use an index

  • Large tables
  • Common filter conditions
  • Primary key

Avoid an index

  • Small tables
  • Columns with many nulls
  • Frequently updated tables
    • Index will become fragmented
    • Writes data in two places
Improving Query Performance in PostgreSQL

Frequently updated tables

Two tables. An index table contains the recipe column along with a column of pointers, which go from _12 to _19 sequentially. The table with index is the same table shown on the left with an additional pointer column. The values are the same in this pointer column as in the pointer column of the Index table. Arrows show how the four values for spaghetti and meatballs in the Index table correspond to the four values for spaghetti and meatballs in the Table with Index. Three of the values are sorted together at the top of the Index table. One spaghetti and meatballs is at the bottom of the Index table, corresponding to the new basil entry in the Table with index.

Improving Query Performance in PostgreSQL

Index query assessment

Query planner

Multiple chefs around a big cooking pot

EXPLAIN
SELECT * 
FROM cookbook

 

Query Plan

Seq scan on cookbook (cost=0.00...22.70
  rows = 1270 width = 36)
  • Cost (time) estimates
Improving Query Performance in PostgreSQL

Let's practice!

Improving Query Performance in PostgreSQL

Preparing Video For Download...