Time Series Analysis in PostgreSQL
Jasmin Ludolf
Content Developer, DataCamp

SELECT station, arrival_time
FROM train_schedule
WHERE train_id = 324
|station      |arrival_time|
|-------------|------------|
|San Francisco|    07:59:00|
|22nd Street  |    08:03:00|
|Millbrae     |    08:16:00|
|Hillsdale    |    08:24:00|
|Redwood City |    08:31:00|
|Palo Alto    |    08:37:00|
...
SELECT year_month, t_monthly_min, t_monthly_max, t_monthly_avg
FROM temperature_stations AS ts
JOIN temperatures_monthly AS tm USING(station_id)
WHERE station_id = 1
|year_month|t_monthly_min|t_monthly_max|t_monthly_avg|
|----------|-------------|-------------|-------------|
|2010-01-01|          6.7|         20.3|         13.6|
|2010-02-01|          8.3|         20.8|         14.8|
|2010-03-01|          9.1|         23.9|         17.0|
|2010-04-01|         11.3|         27.1|         20.0|
|2010-05-01|         15.0|         32.1|         24.8|
...
SELECT ts, views
FROM dc_news_fact
WHERE id = '12561'
ORDER BY ts;
|ts                 |views|
|-------------------|-----|
|2015-12-29 11:40:23| 4161|
|2015-12-29 12:00:23|  407|
|2015-12-29 12:20:23|    0|
|2015-12-29 12:40:23|  778|
|2015-12-29 13:00:23|  396|
...
DATE_TRUNC(): alters granularity of time series data
DATE_TRUNC(field, source, time zone)SELECT DATE_TRUNC('hour', ts) AS hour,SUM(VIEWS) AS viewsFROM dc_news_fact WHERE id = '12561' GROUP BY hour ORDER BY hour;
|hour               |views|
|-------------------|-----|
|2015-12-29 09:00:00|    0|
|2015-12-29 10:00:00|    0|
|2015-12-29 11:00:00| 4161|
|2015-12-29 12:00:00| 1185|
|2015-12-29 13:00:00| 1146|
|2015-12-29 14:00:00|  697|
|2015-12-29 15:00:00| 1013|
|2015-12-29 16:00:00|  956|
|2015-12-29 17:00:00| 1307|
|2015-12-29 18:00:00|  700|
...
DATE_PART(): extracts specific data
DATE_PART(field, source)
EXTRACT() also extracts specific dataDATE_PART()DATE_PART() returns a result in double precision which is impreciseSELECT DATE_PART('hour', ts) AS hour_of_day, SUM(VIEWS) AS viewsFROM dc_news_fact WHERE id = '12561' GROUP BY hour_of_day ORDER BY hour_of_day;
|hour_of_day|views|
|-----------|-----|
|          0|  500|
|          1|  500|
|          2|  400|
|          3|  400|
|          4| 1200|
|          5|  100|
|          6|  100|
|          7| 2000|
|          8|  200|
|          9| 2500|
...
SELECT
    EXTRACT(dow FROM ts)
    AS day_of_week,
    SUM(VIEWS) AS views
FROM dc_news_fact
WHERE id = '12561'
GROUP BY day_of_week
ORDER BY day_of_week;
|day_of_week|views|
|-----------|-----|
|          2|15265|
|          3|13100|
|          4| 1200|
Time Series Analysis in PostgreSQL