Data Privacy and Anonymization in Python
Rebeca Gonzalez
Data engineer



from diffprivlib.models import KMeans

# Computing the clusters with the DP model
model = KMeans(epsilon=1, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)
Preprocessing steps such as StandardScaler and dimensionality reduction methods like PCA can be applied before clustering. Use diffprivlib models just as you would sklearn models.

from sklearn.decomposition import PCA
from diffprivlib.models import KMeans as dp_Kmeans

# Initialize PCA
pca = PCA()

# Fit transform data with PCA
X = pca.fit_transform(X)

# Computing the clusters with the DP model
model = dp_Kmeans(epsilon=1, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)



from diffprivlib.models import KMeans as dp_Kmeans

# Computing the clusters with the DP model
model = dp_Kmeans(epsilon=0.2, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)
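Lowering epsilon from 1 to 0.2 strengthens the privacy guarantee but adds more noise. The sketch below illustrates that trade-off with the generic Laplace mechanism, where the noise scale is sensitivity / epsilon. It is an illustration only and is not necessarily how diffprivlib's KMeans adds noise internally; the helper names and the sensitivity of 1.0 are assumptions.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def mean_abs_noise(epsilon, sensitivity=1.0, trials=10_000, seed=0):
    # Hypothetical helper: average noise magnitude for a given epsilon
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return sum(abs(laplace_noise(scale, rng)) for _ in range(trials)) / trials

err_eps_1 = mean_abs_noise(epsilon=1)
err_eps_02 = mean_abs_noise(epsilon=0.2)

print(f"epsilon=1:   mean |noise| = {err_eps_1:.2f}")
print(f"epsilon=0.2: mean |noise| = {err_eps_02:.2f}")
```

With epsilon five times smaller, the noise scale is five times larger, so cluster assignments from the epsilon=0.2 model can be expected to be less stable than those from the epsilon=1 model.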
