Great work!

Data Privacy and Anonymization in Python

Rebeca Gonzalez

Data engineer

You have completed the course

GIF of Leonardo DiCaprio raising a glass in congratulation and celebration

Recap: What you have learned

  • Sensitive and non-sensitive personally identifiable information (PII)
  • Quasi-identifiers
  • Linkage attacks
  • Data suppression
  • Data masking
  • Data generalization
  • Synthetic data generation
  • Sampling from probability distributions for different types of attributes
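
As a quick refresher, suppression, masking, and generalization can be sketched on a couple of toy records. This is a minimal sketch using only the standard library; the records, field names, and helper functions (`suppress`, `mask_email`, `generalize_age`) are hypothetical and are not the course's own code.

```python
# Hypothetical toy records; none of these values come from the course datasets.
records = [
    {"name": "Ana Ruiz", "email": "ana.ruiz@example.com", "age": 34, "city": "Lima"},
    {"name": "Ben Cole", "email": "ben.cole@example.com", "age": 37, "city": "Lima"},
]

def suppress(record, fields):
    """Data suppression: drop the listed fields entirely."""
    return {key: value for key, value in record.items() if key not in fields}

def mask_email(email):
    """Data masking: hide the local part of the address, keep the domain."""
    local, _, domain = email.partition("@")
    return "*" * len(local) + "@" + domain

def generalize_age(age, width=10):
    """Data generalization: replace an exact age with a range of `width` years."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def anonymize(record):
    out = suppress(record, {"name"})         # remove the direct identifier
    out["email"] = mask_email(out["email"])  # mask sensitive PII
    out["age"] = generalize_age(out["age"])  # coarsen a quasi-identifier
    return out

anonymized = [anonymize(record) for record in records]
print(anonymized[0])
```

Which fields to suppress, mask, or generalize depends on the dataset; the split chosen here is only for illustration.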

Privacy models: k-anonymity

  • K-anonymous datasets
  • Exploring possible combinations in the dataset
  • Generalizing data using hierarchies and ranges
  • Avoiding re-identification attacks
  • Without falsifying or randomizing data!
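
Checking k-anonymity comes down to counting how many rows share each combination of quasi-identifier values. The sketch below assumes a hypothetical already-generalized table and a hypothetical helper `k_of`; it uses only the standard library and is not the course's implementation.

```python
from collections import Counter

def k_of(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    quasi-identifier values (0 for an empty table)."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) if counts else 0

# Hypothetical table whose age and zip columns are already generalized.
rows = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
]

k = k_of(rows, ["age", "zip"])
print(f"The table is {k}-anonymous with respect to age and zip")
```

If the smallest group is below the target k, the usual next step is to generalize further (wider ranges, higher levels of the hierarchy) and check again.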

Privacy models: differential privacy

  • Differential privacy systems can measure and quantify privacy in data releases
  • One of the most important definitions of privacy today
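
One common way to see how privacy gets quantified is the Laplace mechanism for a counting query: noise with scale 1/epsilon is added to the true count, so a smaller epsilon means more noise and stronger privacy. The sketch below is an illustration under assumptions, not the course's code: `laplace_noise` and `private_count` are hypothetical helpers, the ages are made up, and Python's `random` module is not a cryptographically secure noise source.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponential draws with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(values, predicate, epsilon, rng):
    """Counting query with Laplace noise. Adding or removing one person
    changes a count by at most 1, so the sensitivity is 1."""
    true_count = sum(1 for value in values if predicate(value))
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)  # seeded only to make the illustration repeatable
ages = [23, 35, 41, 52, 29, 67, 38, 45]  # hypothetical data

noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(f"Noisy count of people aged 40+: {noisy:.2f}")
```

A single noisy answer can be off by a few units, but the noise averages out to zero, which is why every repeated query has to be charged against the privacy budget.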

Differentially private models and operations

  • People are increasingly working with differentially private machine learning and deep learning models
  • Trained and ran different types of differentially private machine learning models!
  • Practiced advanced concepts such as the privacy budget and budget tracking
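
Budget tracking can be pictured as a small ledger: each differentially private operation spends some epsilon, and under basic sequential composition the spent amounts add up until the total is reached. The `SimpleBudget` class below is a hypothetical illustration of that idea only; it is not the API of any particular library.

```python
class SimpleBudget:
    """Hypothetical epsilon ledger using basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        """Record one query's epsilon, refusing it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:  # small tolerance for float error
            raise RuntimeError("privacy budget exceeded")
        self.spent += epsilon

    def remaining(self):
        return max(self.total - self.spent, 0.0)

budget = SimpleBudget(total_epsilon=1.0)
budget.spend(0.4)  # e.g. training a first model
budget.spend(0.4)  # e.g. training a second model
print(f"Remaining epsilon: {budget.remaining():.2f}")

try:
    budget.spend(0.4)  # a third query of this size no longer fits
    third_query_allowed = True
except RuntimeError:
    third_query_allowed = False
```

Real accountants can use tighter composition rules and also track delta; simple addition is the most conservative and easiest to reason about.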

Other interesting libraries

  • Google's differential privacy library
  • TensorFlow Privacy
  • ARX Data Anonymization Tool

Google Open Source logo

TensorFlow logo

Congrats!
