Creating dbt descriptions and tests

Case Study: Building E-Commerce Data Models with dbt

Susan Sun

Freelance Data Scientist

docs: dbt user-defined descriptions

  • User-defined descriptions in yaml files
  • Used to document: dbt models, sources, seeds, data tests, etc.
  • Naming convention:
    _<data source>__<asset>.yml
    
  • Store in the same directory as the assets
looker_ecommerce/
  macros/
  models/
    _looker__models.yml     <------ 
    _looker__sources.yml    <------
    stg_looker__distribution_center.sql
    stg_looker__orders.sql
  seeds/
    looker__distribution_center.csv
Case Study: Building E-Commerce Data Models with dbt

docs: dbt model yaml

Sample _looker__models.yml:

version: 2

models:
  - name: model_name
    description: This is a table
    columns:
      - name: column_name
        description: This is a column
      - name: column_name
        description: This is a column

Note:

  • version: 2 is the schema configuration format used by dbt
  • models defines what asset this user defined is documenting
  • 2 spaces before table name
  • 4 spaces before column name
Case Study: Building E-Commerce Data Models with dbt

dbt data tests: not null and unique

  • Four default data tests live in yaml files:
    • unique
    • not_null
    • accepted_values
    • relationship

1.unique: each row value is unique

- name: table_name
  columns: 
    - name: column_name
      data_tests:
        - not_null
    ...

2.not_null: no row can have a null value

- name: table_name
  columns: 
    - name: column_name
      data_tests:
        - unique
        - not_null
    - name: column_name
      data_tests:
        - unique
Case Study: Building E-Commerce Data Models with dbt

dbt data tests: accepted values

3.accepted_values: only values in list are accepted

- name: table_name
  columns: 
    - name: column_name
      data_tests:
        - accepted_values:
            values: ['value_a', 'value_b', 'value_c', NULL]
Case Study: Building E-Commerce Data Models with dbt

dbt data tests: relationships

4.relationships: referential integrity (foreign key) between tables

- name: table_1
  columns: 
    - name: column_1
      data_tests:
        - relationships:
            to: ref('table_2')
            field: column_2
Case Study: Building E-Commerce Data Models with dbt

Let's practice!

Case Study: Building E-Commerce Data Models with dbt

Preparing Video For Download...