Introduction to Data Quality with Great Expectations
Davina Moossazadeh
Data Scientist
GX components - Python classes that represent data and data validation entities
Data Source - An object that tells GX how to connect to a specific source of external data
Manage Data Sources with the .data_sources attribute, using the .add_pandas() method:
data_source = context.data_sources.add_pandas(
    name="my_pandas_datasource"
)
Note: The name parameter is the identifier GX stores for the Data Source. It is separate from the Python variable that holds the object: "my_pandas_datasource" vs. data_source
Data Asset - A collection of records within a Data Source
data_asset = data_source.add_dataframe_asset(
    name="my_dataframe_asset"
)
Create Data Source from Data Context:
data_source = context.data_sources.add_pandas(
    name: str
)
Create Data Asset from Data Source:
data_asset = data_source.add_dataframe_asset(
    name: str
)