Queryprestaties begrijpen

Introductie tot Redshift

Jason Myers

Principal Architect

Query-optimalisatieproces

  1. Check de tabel STL_ALERT_EVENT_LOG
  2. Voer EXPLAIN uit op je query
  3. Check de tabellen SVL_QUERY_SUMMARY en SVL_QUERY_REPORT
Introductie tot Redshift

Tabel STL_ALERT_EVENT_LOG

Bevat alle Redshift-waarschuwingen die door een query zijn getriggerd

SELECT * 
  FROM stl_alert_event_log 
 WHERE query = 1337;
 query |             event             |          solution       |     event_time      
=======+===============================+=========================+====================
  1337 | Missing query planner statist | Run the ANALYZE command | 2023-11-13 18:20:58
Introductie tot Redshift

Het uitvoeringsplan met EXPLAIN

  • Toont alle stappen van een query en hun relatieve kosten en gescande rijen

  • Werkt voor alle DML-statements:

    • SELECT
    • SELECT INTO
    • CREATE TABLE AS
    • INSERT
    • UPDATE
    • DELETE
Introductie tot Redshift

EXPLAIN-voorbeeld

-- EXPLAIN draaien op onze query voor de 
-- top tien divisies op omzet
EXPLAIN WITH top_ten_divisions_by_rev AS
(
  SELECT division_id,
         SUM(revenue) AS revenue_total
    FROM sales_data
   GROUP BY division_id
   ORDER BY revenue_total DESC
   LIMIT 10
),
division_names AS
(
  SELECT id AS division_id,
         name AS division_name
    FROM division_names
)
SELECT division_name,
       revenue_total
  FROM top_ten_divisions_by_rev
  JOIN division_names USING (DIVISION_ID)
 WHERE revenue_total > 100000;
Introductie tot Redshift

Uitlegresultaten

|QUERY PLAN                                                                                      |
+------------------------------------------------------------------------------------------------+
|Hash Join  (cost=47.11..58.89 rows=3 width=524)                                                 |
|  Hash Cond: (division_names.id = top_ten_divisions_by_rev.division_id)                         |

| -> Seq Scan on division_names (cost=0.00..11.40 rows=140 width=520) | | -> Hash (cost=47.07..47.07 rows=3 width=12) |
| -> Subquery Scan on top_ten_divisions_by_rev (cost=46.92..47.07 rows=3 width=12) | | Filter: (top_ten_divisions_by_rev.revenue_total > 100000) | | -> Limit (cost=46.92..46.95 rows=10 width=12) |
| -> Sort (cost=46.92..47.42 rows=200 width=12) | | Sort Key: (sum(sales_data.revenue)) DESC |
| -> HashAggregate (cost=40.60..42.60 rows=200 width=12) | | Group Key: sales_data.division_id | | -> Seq Scan on sales_data (cost=0.00..30.40 rows=2040 width=8)|
1 https://www.postgresql.org/docs/current/using-explain.html
Introductie tot Redshift

Laten we oefenen!

Introductie tot Redshift

Preparing Video For Download...