Data sources and risks

Data Science for Business

Michael Chow

Data Scientist

Common sources of data

  • Web events
  • Customer data
  • Logistics data
  • Financial transactions

Web data

  • Events
  • Timestamps
  • User information

 

user_id  event_name      timestamp
1234     homepage_visit  2019-01-01 12:01:01
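
A minimal sketch of how one such event row could be read into a typed record. The `WebEvent` class and `parse_event` helper are hypothetical names introduced for illustration, and the whitespace-separated layout is assumed from the sample row above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WebEvent:
    """One web event: who did what, and when (hypothetical record type)."""
    user_id: int
    event_name: str
    timestamp: datetime


def parse_event(line: str) -> WebEvent:
    # Split into at most three fields, because the timestamp itself contains a space.
    user_id, event_name, timestamp = line.split(maxsplit=2)
    return WebEvent(
        user_id=int(user_id),
        event_name=event_name,
        timestamp=datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S"),
    )


# The sample row from the slide.
event = parse_event("1234 homepage_visit 2019-01-01 12:01:01")
```

Parsing the timestamp into a `datetime` up front makes later steps, such as ordering a user's events, straightforward.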

Personally Identifiable Information (PII)

    deanonymized-browsing.png

"Jane Doe" = Personally Identifiable Information (PII)

Data pseudonymization

pseudoanonymized-browsing.png

  • Restricted access
  • Audit logs
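
One way pseudonymization might look in code, as a sketch rather than a prescribed method: each name is replaced by a keyed hash, so the same person always maps to the same pseudonym, and only someone holding the key can recompute the link. The key value, the `pseudonymize` helper, and the sample rows are all assumptions made for illustration; the key is what would sit behind restricted access.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored under restricted access.
SECRET_KEY = b"replace-with-a-secret-kept-under-restricted-access"


def pseudonymize(name: str, key: bytes = SECRET_KEY) -> str:
    """Replace a name with a stable pseudonym derived from a keyed hash."""
    digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


# Illustrative rows, not real data.
events = [
    {"user": "Jane Doe", "event_name": "homepage_visit"},
    {"user": "Jane Doe", "event_name": "checkout"},
]

# Copy each row with the name swapped for its pseudonym.
pseudonymized = [{**e, "user": pseudonymize(e["user"])} for e in events]
```

Both rows end up with the same pseudonym, so a user's browsing can still be analyzed as a unit without the name appearing in the data.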

Data anonymization

anonymized-browsing.png
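
A rough sketch of one anonymization approach, assuming that aggregate counts are all the analysis needs: the user field is dropped entirely and only per-event totals are kept. The `anonymize` helper and the sample rows are hypothetical. Removing names alone does not always prevent re-identification, so this shows the idea rather than a guarantee.

```python
from collections import Counter

# Illustrative rows, not real data.
events = [
    {"user": "Jane Doe", "event_name": "homepage_visit"},
    {"user": "Jane Doe", "event_name": "checkout"},
    {"user": "John Roe", "event_name": "homepage_visit"},
]


def anonymize(rows):
    """Discard who did what; keep only how often each event occurred."""
    return Counter(row["event_name"] for row in rows)


event_counts = anonymize(events)
```

Unlike pseudonymized data, the result carries no per-user link that could be restored later.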

General Data Protection Regulation (GDPR)

  • Applies to personal data of individuals in the EU
  • Gives individuals control over their personal data
  • Regulates how long data can be stored
  • Mandates appropriate anonymization
  • Requires disclosing data collection and gaining consent

Let's practice!
