Metadata and data quality

Introduction to Data Quality

Chrissy Bloom

Head of Enterprise Data Strategy & Governance

What is metadata?

Metadata: data about data, or attributes that describe data

  • Used to organize and understand datasets and data elements
  • Used in the data quality process to determine the:
    • definition of a field
    • owner of a field
    • field's last update date

[Figure: examples of metadata: definition, data owner, update date]

Metadata examples

Metadata can be found in a data dictionary.

Examples:

  • Business field name
  • Business definition
  • Data owner
  • Technical physical field name

[Figure: example of metadata in a data catalog]
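
As a minimal sketch of how the four metadata attributes listed above might be stored and looked up, the Python below models a data dictionary as a mapping keyed by technical field name. The class name `DictionaryEntry`, the `lookup` helper, and every field value (for example `cust_dob`) are hypothetical and chosen only for illustration; they do not come from any particular data catalog tool.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """One row of a data dictionary (all values below are hypothetical)."""
    business_field_name: str
    business_definition: str
    data_owner: str
    technical_field_name: str


# Hypothetical entries, keyed by technical physical field name.
DATA_DICTIONARY: Dict[str, DictionaryEntry] = {
    entry.technical_field_name: entry
    for entry in [
        DictionaryEntry(
            business_field_name="Customer Birth Date",
            business_definition="Date of birth provided by the customer.",
            data_owner="Customer Data Office",
            technical_field_name="cust_dob",
        ),
        DictionaryEntry(
            business_field_name="Account Open Date",
            business_definition="Date on which the account was opened.",
            data_owner="Retail Banking Operations",
            technical_field_name="acct_open_dt",
        ),
    ]
}


def lookup(technical_field_name: str) -> Optional[DictionaryEntry]:
    """Return the entry for a physical field, or None if it is undocumented."""
    return DATA_DICTIONARY.get(technical_field_name)


entry = lookup("cust_dob")
if entry is not None:
    print(f"{entry.business_field_name} is owned by {entry.data_owner}")
```

A lookup that returns `None` signals a field with no documented definition or owner, which is itself worth flagging during a data quality review.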

What is data lineage?

Data lineage: a representation of how data moves through a pipeline, from the point where it is entered in the source system, through each step in the pipeline, until it is consumed.

[Figure: example of data lineage]

  • Each layer has its own metadata
  • Used in the data quality process to determine where to implement a data quality rule
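
The two points above can be sketched in Python as an ordered list of layers, each carrying its own metadata for the same business field. The layer names, the field names, and the `trace` and `earliest_layer` helpers are hypothetical. Placing a rule at the earliest layer that carries the field is an assumption made for this sketch, not a rule stated in these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Layer:
    """One step in the pipeline, with its own metadata."""
    name: str
    # Maps a business field name to this layer's physical field name.
    fields: Dict[str, str] = field(default_factory=dict)


# Hypothetical lineage, ordered from source to consumption.
LINEAGE: List[Layer] = [
    Layer("source_system", {"Customer Birth Date": "DOB"}),
    Layer("staging", {"Customer Birth Date": "cust_dob"}),
    Layer("warehouse", {"Customer Birth Date": "customer_birth_dt"}),
    Layer("report", {"Customer Birth Date": "Birth Date"}),
]


def trace(business_field: str, lineage: List[Layer]) -> List[str]:
    """List 'layer.physical_name' for every layer that carries the field."""
    return [
        f"{layer.name}.{layer.fields[business_field]}"
        for layer in lineage
        if business_field in layer.fields
    ]


def earliest_layer(business_field: str, lineage: List[Layer]) -> Optional[str]:
    """Return the first layer carrying the field (assumed rule placement)."""
    for layer in lineage:
        if business_field in layer.fields:
            return layer.name
    return None


for hop in trace("Customer Birth Date", LINEAGE):
    print(hop)
print("Candidate layer for the rule:", earliest_layer("Customer Birth Date", LINEAGE))
```

Tracing the field shows how its physical name changes from layer to layer, which is why each layer needs its own metadata before a rule can be attached to the right column.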

Data lineage example

[Figure: detailed example of data lineage]

Metadata and data lineage example

[Figure: data lineage example]

Metadata and data lineage example: bad practice

[Figure: data lineage example]

Metadata and data lineage example: best practice

[Figure: data lineage example]

Let's practice!
