Time Series Analysis in PostgreSQL
Jasmin Ludolf
Content Developer, DataCamp
Range boundaries:
Alternative to range: partition by event or category
Range 1 : 2000-2010
2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
Range 2 : 2010-2020
2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020
PARTITION BY
RANGE
CREATE TABLE timetable (
date_info DATE,
train_id INTEGER,
departure_time TIMESTAMP,
arrival_time TIMESTAMP,
delay INTEGER
) PARTITION BY RANGE (date_info);
PARITION OF
FOR VALUES FROM
timetable
is a partitioned table, partitioned on date_info
using the RANGE
method
CREATE TABLE timetable_y2020 PARTITION OF timetable
FOR VALUES FROM ('2020-01-01') to ('2020-12-31');
OVER
clause indicates a window functionregional_timetable
date, times, id, delays, and regionPARTITION BY
: the window function treats the entire table as one partitionSELECT region, train_id, delay, AVG(delay) OVER (PARTITION BY region)
FROM regional_timetable;
|region|train_id| delay| avg|
|------|--------|--------|--------|
| north| 6|00:00:05|00:00:05|
| north| 2|00:00:03|00:00:05|
| north| 5|00:00:07|00:00:05|
| east| 1|00:00:10|00:00:21|
| east| 3|00:00:32|00:00:21|
| south| 8|00:01:00|00:01:00|
Time Series Analysis in PostgreSQL