Data Privacy and Anonymization in Python
Rebeca Gonzalez
Instructor
Making the same private query with $\epsilon$ = 1 twice is like making a single query with $\epsilon$ = 2.
Third parties can average repeated answers together, filtering out the noise.
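A minimal sketch of why averaging is a problem, assuming each answer is the true value plus Laplace noise with scale sensitivity / epsilon. The true mean, the sensitivity, and the helper names are hypothetical, chosen only for illustration, and the noise is generated with the standard library rather than with diffprivlib.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_answer(true_value, sensitivity, epsilon, rng):
    # Laplace mechanism: noise scale is sensitivity / epsilon.
    return true_value + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
true_mean = 80000.0   # hypothetical true answer
sensitivity = 230.0   # hypothetical sensitivity of the query
epsilon = 1.0

# The same private query repeated many times by a third party
answers = [private_answer(true_mean, sensitivity, epsilon, rng)
           for _ in range(10000)]

# Typical error of one answer versus the error of the averaged answers
mean_single_error = sum(abs(a - true_mean) for a in answers) / len(answers)
averaged_error = abs(sum(answers) / len(answers) - true_mean)

print("Typical error of one answer:", mean_single_error)
print("Error after averaging:      ", averaged_error)
```

The averaged answer lands much closer to the true value than a typical single answer, which is why repeated queries have to be counted against a shared budget.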
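Remember that epsilon acts exponentially: the privacy guarantee is bounded by a factor of $e^\epsilon$, so a small increase in epsilon weakens it quickly.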
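The two statements above can be written out using the standard definition of $\epsilon$-differential privacy. Here $M$ is the private mechanism, $D$ and $D'$ are datasets differing in one record, and $S$ is any set of outputs; these symbols are not in the slides and are introduced only for this sketch.

```latex
% Epsilon-differential privacy: output probabilities on neighbouring
% datasets differ by at most a factor of e^epsilon.
\Pr[M(D) \in S] \le e^{\epsilon} \, \Pr[M(D') \in S]

% Sequential composition: two queries with budgets epsilon_1 and epsilon_2
% multiply their factors, so the budgets add up.
e^{\epsilon_1} \cdot e^{\epsilon_2} = e^{\epsilon_1 + \epsilon_2}

% Two queries with epsilon = 1 each:
e^{1} \cdot e^{1} = e^{2} \approx 7.39 \quad \text{versus} \quad e^{1} \approx 2.72
```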
from diffprivlib import BudgetAccountant
acc = BudgetAccountant(epsilon=5)
acc
BudgetAccountant(epsilon=5)
from diffprivlib import tools

# Compute a private mean of the salaries using epsilon of 0.5
# Use the Budget Accountant acc and set bounds to be from 0 to 230000
dp_mean = tools.mean(salaries, epsilon=0.5, accountant=acc, bounds=(0, 230000))
# Print the resulting private mean
print("Private mean: ", dp_mean)
Private mean: 82524.72611901595
# Total privacy spent
print("Total spent: ", acc.total())
# Privacy budget remaining
print("Remaining budget: ", acc.remaining())
# Total number of queries done so far
print("Number of queries recorded: ", len(acc))
Total spent: (epsilon=0.5, delta=0.0)
Remaining budget: (epsilon=4.5, delta=1.0)
Number of queries recorded: 1
# Privacy budget remaining for 2 queries
print("Remaining budget for 2 queries: ", acc.remaining(2))
Remaining budget for 2 queries: (epsilon=2.25, delta=1.0)
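The epsilon value of 2.25 can be reproduced by hand. The sketch below assumes pure epsilon accounting (delta of 0, no slack), where spent budgets simply add up and the remainder is split evenly across the planned queries. The function `remaining_per_query` is hypothetical and is not how diffprivlib implements `remaining`.

```python
def remaining_per_query(total_epsilon, spent_epsilons, k=1):
    """Epsilon available to each of k further queries (hypothetical helper).

    Assumes simple sequential composition: spent epsilons add up,
    and what is left is divided evenly among the k queries.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    remaining = total_epsilon - sum(spent_epsilons)
    return max(remaining, 0.0) / k

# Budget of 5 with one query of epsilon 0.5 already recorded
print(remaining_per_query(5, [0.5]))       # budget left for one more query
print(remaining_per_query(5, [0.5], k=2))  # budget left for each of two queries
```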