Introduction to Data Quality with Great Expectations
Davina Moossazadeh
Data Scientist
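import great_expectations as gx

# Expect the table to have exactly 10 columns
expectation = gx.expectations.ExpectTableColumnCountToEqual(
    value=10
)
# Create an Expectation Suite to hold the Expectation
suite = gx.ExpectationSuite(
    name="my_suite"
)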
# Add the Expectation to the Suite
suite.add_expectation(
    expectation=expectation
)
# Create another Expectation Suite
another_suite = gx.ExpectationSuite(name="my_other_suite")
# Add the same Expectation to the new Suite
another_suite.add_expectation(expectation=expectation)
An Expectation cannot belong to more than one Suite at a time:
RuntimeError: Cannot add Expectation because it already belongs to an
ExpectationSuite. If you want to update an existing Expectation, please call
Expectation.save(). If you are copying this Expectation to a new ExpectationSuite,
please copy it first (the core expectations and some others support
copy(expectation)) and set `Expectation.id = None`.
Copy the Expectation, set .id to None, and add the copy to the new Suite without errors:
expectation_copy = expectation.copy()
expectation_copy.id = None
another_suite.add_expectation(
    expectation=expectation_copy
)
print(
    expectation_copy in another_suite.expectations
)
True
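The copy-and-reset step can be wrapped in a small helper so it is not forgotten when reusing an Expectation. This is a minimal sketch, not part of the Great Expectations API: the name detached_copy is hypothetical, and StandInExpectation is a hypothetical stand-in class used only so the sketch runs without a Data Context. It uses the standard-library copy.copy, which the error message above says core Expectations support.

```python
import copy


def detached_copy(expectation):
    """Return a shallow copy of `expectation` that no Suite owns yet.

    Hypothetical helper: it relies only on the object having an `id` attribute.
    """
    duplicate = copy.copy(expectation)
    # Clearing the id marks the copy as a brand-new Expectation.
    duplicate.id = None
    return duplicate


class StandInExpectation:
    """Hypothetical stand-in for illustration, not a Great Expectations class."""

    def __init__(self, value, id=None):
        self.value = value
        self.id = id


original = StandInExpectation(value=10, id="abc-123")
duplicate = detached_copy(original)

# The copy keeps the configuration but drops the id; the original is untouched.
print(duplicate.value, duplicate.id, original.id)  # prints: 10 None abc-123
```

With a real Suite, the intended use would be another_suite.add_expectation(expectation=detached_copy(expectation)), assuming the Expectation type supports copying.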
Add
.add_expectation()
suite.add_expectation(
    expectation=expectation
)
Delete
.delete_expectation()
suite.delete_expectation(
    expectation=expectation
)
Update the .value attribute and save the changes:
expectation = gx.expectations.ExpectTableColumnCountToEqual(
    value=10
)
expectation.value = 11
expectation.save()
Make sure the Expectation belongs to a Suite; otherwise, .save() raises:
RuntimeError: Expectation must be added to ExpectationSuite before it can be saved.
suite = gx.ExpectationSuite(name="my_suite")
validation_definition = gx.ValidationDefinition(
    data=batch_definition, suite=suite, name="my_validation_definition"
)
# Define the Expectation
col_name_expectation = gx.expectations.ExpectColumnToExist(column="GHI")
# Add the Expectation to the Suite
suite.add_expectation(expectation=col_name_expectation)
# Run the Validation Definition associated with the Suite
validation_results = validation_definition.run()
Save changes to the Suite before running the Validation Definition; otherwise, running it raises an error:
validation_results = validation_definition.run()
ResourceFreshnessAggregateError: ExpectationSuite 'my_suite' has changed since it
has last been saved. Please update with `<SUITE_OBJECT>.save()`, then try your
action again.
Use the .save() method to save the Suite, then run the Validation Definition without errors:
suite.save()
validation_results = validation_definition.run()
print(validation_results.success)
False
Copy the Expectation:
expectation_copy = expectation.copy()
expectation_copy.id = None
Check whether the Expectation is in the Suite:
expectation in suite.expectations
Delete the Expectation:
suite.delete_expectation(expectation)
Update the Expectation's value:
expectation.value = new_value
Save changes to the Expectation:
expectation.save()
Save changes to the Expectation Suite:
suite.save()