dbt sources

Intermediate dbt

Mike Metzger

Data Engineer

What is a dbt source?

  • Name and description of data loaded by EL process
  • Helps define data lineage
  • Tests
  • Documentation

Data source

1 Photo by Mika Baumeister on Unsplash
Intermediate dbt

Sources

  • Provides data lineage information
  • Describes the flow of data in warehouse
  • Use the Jinja {{ source() }} function
    • Similar to the {{ ref() }} function
  • Simplifies accessing raw data
select * 
from
  {{ source('raw', 'orders') }}
Intermediate dbt

Defining a source

  • In the .yml file
  • File can be models/model_properties.yml
    • Or any other .yml file in the directory
  • Under the sources: section
  • Name the source with the - name: option
  • Define each source table with a - name: option under tables:
  • Different options available depending on the data warehouse type, refer to dbt documentation.
version: 2

sources:
  - name: raw
    tables:
      - name: phone_orders
      - name: web_orders
Intermediate dbt

Accessing sources

  • Use the {{ source() }}
  • {{ source(source_name, table_name) }}
  • Provides the proper name of the table in the compiled query
select * from
  {{ source('raw', 'phone_orders') }}
UNION
select * from
  {{ source('raw', 'web_orders') }}
-- dbt compiled
select * from
  'raw'.'phone_orders'
UNION
select * from
  'raw'.'web_orders'
Intermediate dbt

Testing sources

  • You can apply tests to sources
  • Same methods as applying to models
  • Defined in the sources: section instead of models:
  • In .yml file where sources defined
version: 2

sources:
  - name: raw
    tables:
      - name: phone_orders
        columns:
          - name: id
            tests:
              - not_null
              - unique
      - name: web_orders
Intermediate dbt

Let's practice!

Intermediate dbt

Preparing Video For Download...