Transactions

Introduction to Redshift

Jason Myers

Principal Engineer

Motivation for using transactions

SELECT name,
       priority,
  FROM data_log
       -- SYSDATE = 2024-02-07 00:17:24.259227
 WHERE intake_ts < SYSDATE;

SELECT name,
       data_size,
  FROM data_details
       -- SYSDATE = 2024-02-07 00:18:04.830527
 WHERE current_intake_date < SYSDATE;
Introduction to Redshift

Statement grouping example

data_intake

name priority
idaho_monitoring_locations 1
idaho_samples 2
idaho_site_id 3
UPDATE data_intake 
   SET priority=1 
 WHERE name='idaho_samples';

UPDATE data_intake 
   SET priority=2 
 WHERE name='idaho_monitoring_locations';
Introduction to Redshift

Errored table results

data_intake

name priority
idaho_monitoring_locations 1
idaho_samples 1
idaho_site_id 3
Introduction to Redshift

Transaction advantages and considerations

  • Consistent data outcomes
  • Requiring success or failure for a group of queries
  • Concurrent operations

Default Execution Behavior

  • Each SQL statement is a transaction!

Transactions Affect some Functions

  • Set at start of transactions and stay the same
    • SYSDATE, TIMESTAMP, CURRENT_DATE

Some functions skirt around Transactions

  • Set at each statement execution
    • GETDATE, TIMEOFDAY
Introduction to Redshift

Transactions Structure

  • Opens with BEGIN; or START TRANSACTION;
  • Contains one or more SQL statements with a semicolon after each one
  • Closes with END; or COMMIT;
  • NOTE: Semicolons matter
BEGIN;

query1; query2;
END;
Introduction to Redshift

Getting consistent query results

-- Start a transaction
BEGIN;
SELECT name,
       priority,
  FROM data_log
       -- SYSDATE = 2024-02-07 00:17:24.259227
 WHERE intake_ts < SYSDATE;

SELECT name,
       data_size,
  FROM data_details
       -- SYSDATE = 2024-02-07 00:17:24.259227
 WHERE current_intake_date < SYSDATE;
-- End a transaction
END;
Introduction to Redshift

Function behavior in transactions

-- Start a transaction
BEGIN;
SELECT name,
       priority,
  FROM data_intake
       -- GETDATE = 2024-02-07 00:17:24.259227
 WHERE data_intake_ts < GETDATE();

 SELECT name,
       data_size,
  FROM data_details
       -- GETDATE = 2024-02-07 00:18:44.830527
 WHERE current_intake_date < GETDATE();
-- End a transaction
END;
Introduction to Redshift

Let's practice!

Introduction to Redshift

Preparing Video For Download...