Data Privacy and Anonymization in Python
Rebeca Gonzalez
Data engineer
from diffprivlib.models import KMeans

# Computing the clusters with the DP model
model = KMeans(epsilon=1, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)
Preprocessing such as StandardScaler and dimensionality reduction methods like PCA can be applied before clustering.
Use diffprivlib models just as you would do with sklearn models.

from sklearn.decomposition import PCA

# Initialize PCA
pca = PCA()

# Fit transform data with PCA
X = pca.fit_transform(X)
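
# Import the differentially private KMeans under the alias used below
from diffprivlib.models import KMeans as dp_Kmeans

# Computing the clusters with the DP model
model = dp_Kmeans(epsilon=1, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)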

from diffprivlib.models import KMeans as dp_Kmeans

# Computing the clusters with the DP model, using a smaller epsilon (stronger privacy, more noise)
model = dp_Kmeans(epsilon=0.2, n_clusters=3)

# Run the model and obtain clusters
clusters = model.fit_predict(X)
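
The slides above lower epsilon from 1 to 0.2. To get a feel for what that change means, here is a minimal sketch of the generic Laplace mechanism applied to a mean, using only the standard library. It is not how diffprivlib implements KMeans internally. The helper names (laplace_noise, private_mean, mean_abs_error), the synthetic ages data, and the bounds 18 and 90 are all assumptions made for illustration.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mean(values, lower, upper, epsilon, rng):
    """Return a Laplace-noised mean of values clipped to [lower, upper]."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Replacing one record can move the clipped mean by at most this much.
    sensitivity = (upper - lower) / len(clipped)
    # The noise scale grows as epsilon shrinks.
    return statistics.fmean(clipped) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [rng.uniform(18, 90) for _ in range(200)]  # synthetic, illustrative data
true_mean = statistics.fmean(ages)


def mean_abs_error(epsilon, trials=2000):
    """Average absolute error of the private mean over repeated releases."""
    errors = [
        abs(private_mean(ages, 18, 90, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


error_eps_1 = mean_abs_error(1.0)
error_eps_02 = mean_abs_error(0.2)
print("epsilon=1.0 mean abs error:", error_eps_1)
print("epsilon=0.2 mean abs error:", error_eps_02)
```

In this sketch the noise scale is sensitivity / epsilon, so moving from epsilon=1 to epsilon=0.2 makes the expected error about five times larger. The same trade-off, stronger privacy for less accurate output, is what to expect when comparing the clusters from the two dp_Kmeans runs.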