Meet Redshift, a columnar database

Introduction to Redshift

Jason Myers

Principal Architect

Redshift overview

  • Distributed
  • Columnar database
  • Uses PostgreSQL 9 syntax with some enhancements
  • Serverless and provisioned clusters

Redshift logo

Introduction to Redshift

Columnar vs row based databases

Row-based

Row-based database example

Columnar

Columnar database example

Introduction to Redshift

Redshift's position in the AWS ecosystem

  • Primary SQL data warehouse
  • Can be the center of our AWS environment for data lakehouse capabilities
  • Supports external schemas in
    • AWS RDS for PostreSQL
    • AWS Aurora PostgreSQL-compatible edition
    • Amazon EMR for Hive support
    • Amazon Kinesis for streaming data support
Introduction to Redshift

Redshift's position in the AWS ecosystem

  • Federated queries with other AWS RDS databases and more
  • Integrates with Amazon SageMaker via Redshift ML to empower data scientists.
Introduction to Redshift

Competitors

  • Snowflake, BigQuery, Databricks Lakehouse, Azure Synapse Analytics

Advantages

  • AWS Integration: no copy data sharing with other services
  • Share-nothing architecture
  • Optimized repeat query performance
  • Reserved instance (RI) pricing

Disadvantages

  • AWS only
  • Unoptimized query performance
  • Flexibility to support external files costs extra
Introduction to Redshift

Viewing databases and schemas

  • Use SVV_REDSHIFT_SCHEMAS (internal) or SVV_ALL_SCHEMAS (internal and external)
-- View the database and schema names with the schema type
SELECT database_name, 
       schema_name, 
       schema_type 
  -- From the internal schemas view
  FROM SVV_REDSHIFT_SCHEMAS;
database_name        | schema_name         | schema_type
=====================|=====================| ===========
datacamp_course_prod | information_schema  | local
datacamp_course_prod | pg_catalog          | local
datacamp_course_prod | public              | local
dev                  | information_schema  | local
dev                  | pg_catalog          | local
dev                  | public              | local
Introduction to Redshift

Viewing tables in a schema

  • Use SVV_REDSHIFT_TABLES (internal) or SVV_ALL_TABLES (internal and external)
-- View the table name
SELECT table_name 
  -- Using a view with both internal and external tables
  FROM SVV_ALL_TABLES
 -- In the external spectrumdb schema
 WHERE schema_name = 'spectrumdb';
table_name            
======================
ecommerce_sales
global_power_plant_db
coffee_county_weather
idaho_site_id         
idaho_samples
Introduction to Redshift

Let's practice!

Introduction to Redshift

Preparing Video For Download...