Survey of useful functions

Analyzing Business Data in SQL

Michel Semaan

Data Scientist

Dealing with dates

  • DATE_TRUNC('quarter', '2018-08-13') ? '2018-07-01 00:00:00+00:00'
  • '2018-07-01 00:00:00+00:00' :: DATE ? '2018-07-01'

Dates in reports

  • Human-readable dates are important in reporting
  • The default date format, '2018-08-13', isn't very readable
  • How do you get from '2018-08-13' to 'Friday 13, August 2018'?

Solution

  • TO_CHAR('2018-08-13', 'FMDay DD, FMMonth YYYY') ? 'Friday 13, August 2018'
Analyzing Business Data in SQL

TO_CHAR()

  • TO_CHAR(DATE, TEXT) ? TEXT (the formatted date string)
  • Example: Dy ? Abbreviated day name (Mon, Tues, etc.)

    • TO_CHAR('2018-06-01', 'Dy') ? 'Fri'
    • TO_CHAR('2018-06-02', 'Dy') ? 'Sat'
  • Patterns in the format string will be replaced by what they represent in the date; other characters remain as-is

  • Example: DD ? Day number (01 - 31)
    • TO_CHAR('2018-06-01', 'Dy - DD') ? 'Fri - 01'
    • TO_CHAR('2018-06-02', 'Dy - DD') ? 'Sat - 02'
Analyzing Business Data in SQL
Pattern Description
FMDay Full day name (Monday, Tuesday, etc.)
MM Month of year (01 - 12)
Mon Abbreviated month name (Jan, Feb, etc.)
FMMonth Full month name (January, February, etc.)
YY Last 2 digits of year (18, 19, etc.)
YYYY Full 4-digit year (2018, 2019, etc.)

Documentation: https://www.postgresql.org/docs/9.6/functions-formatting.html

Analyzing Business Data in SQL

Query

SELECT DISTINCT
  order_date,
  TO_CHAR(order_date,
          'FMDay DD, FMMonth YYYY') AS format_1,
  TO_CHAR(order_date,
          'Dy DD Mon/YYYY') AS format_2
FROM orders
ORDER BY order_date ASC
LIMIT 3;

Result

order_date  format_1                format_2       
----------  ----------------------  ---------------
2018-06-01  Friday 01, June 2018    Fri 01/Jun 2018
2018-06-02  Saturday 02, June 2018  Sat 02/Jun 2018
2018-06-02  Sunday 03, June 2018    Sun 03/Jun 2018
Analyzing Business Data in SQL

Window functions revisited

  • SUM(...) OVER (...): Calculates a column's running total
    • Example: SUM(registrations) OVER (ORDER BY registration_month) calculates the registrations running total
  • LAG(...) OVER (...): Fetches a preceding row's value
    • Example: LAG(mau) OVER (ORDER BY active_month) returns the previous month's active users (MAU)
  • RANK() OVER (...): Assigns a rank to each row based on that row's position in a sorted order
    • Example: RANK() OVER (ORDER BY revenue DESC) ranks users, eateries, or months by the revenue they've generated
Analyzing Business Data in SQL

Query

SELECT
  user_id,
  SUM(meal_price * order_quantity) AS revenue
FROM meals
JOIN orders ON meals.meal_id = orders.meal_id
GROUP BY user_id
ORDER BY revenue DESC
LIMIT 3;

Result

user_id  revenue
-------  -------
18       626
76       553.25
73       537
Analyzing Business Data in SQL

Query

WITH user_revenues AS (
  SELECT
    user_id,
    SUM(meal_price * order_quantity) AS revenue
  FROM meals
  JOIN orders ON meals.meal_id = orders.meal_id
  GROUP BY user_id)

SELECT
  user_id,
  RANK() OVER (ORDER BY revenue DESC)
    AS revenue_rank
FROM user_revenues
ORDER BY revenue_rank DESC
LIMIT 3;

Result

user_id  revenue_rank
-------  ------------
18       1
76       2
73       3
Analyzing Business Data in SQL

Survey of useful functions

Analyzing Business Data in SQL

Preparing Video For Download...