Data sanitization and masking

Using Data Stores in AWS

Dunieski Otano

AWS Solutions Architect

The log file that exposed everything

  • Application logs user activity for debugging
  • Logs contain full national IDs, credit cards, passwords
  • Logs sent to CloudWatch (accessible to dev team)
  • Compliance audit finds violation: $2M fine

What is data sanitization?

  • Definition
    • Removing or obscuring sensitive data
  • When to sanitize
    • Before logging
    • Before displaying to users
    • Before transmitting to third parties
  • Goal
    • Maintain utility while protecting privacy
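
As a minimal sketch of sanitizing before logging, the filter below masks card-like numbers before any handler (for example, one shipping to CloudWatch) sees the record. The `SanitizingFilter` name, the `[CARD]` placeholder, and the regular expression are illustrative assumptions, not part of the course material; real patterns need tuning to your data.

```python
import logging
import re

# Assumed pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


class SanitizingFilter(logging.Filter):
    """Masks card-like numbers in a log record before handlers see it."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merges msg with its args
        record.msg = CARD_PATTERN.sub("[CARD]", message)
        record.args = None  # args are already merged into msg
        return True  # keep the record, now sanitized


# Attach the filter to the logger the application uses (name is hypothetical).
logger = logging.getLogger("app")
logger.addFilter(SanitizingFilter())
```

Attaching the filter to the logger means application code does not have to remember to sanitize each individual log call.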

Full masking technique

  • Use for
    • Passwords, API keys, tokens
  • Implementation
    • Replace all characters with asterisks
  • Example
    • "myPassword123" → "*************"
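
Full masking can be sketched in a couple of lines. The function name `mask_full` and the `mask_char` parameter are assumptions chosen for illustration.

```python
def mask_full(value: str, mask_char: str = "*") -> str:
    """Replace every character of the value with the mask character."""
    return mask_char * len(value)


masked = mask_full("myPassword123")  # 13 asterisks, one per character
```

One design choice to weigh: a same-length mask still reveals how long the secret is. Returning a fixed-length mask regardless of input is a common alternative when that matters.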

Partial masking technique

  • Use for
    • National IDs, credit cards, phone numbers
  • Implementation
    • Show last 4 digits, mask the rest
  • Example
    • "123-45-6789" ==> "***-**-6789"
    • "4532-1234-5678-9010" ==> "****-****-****-9010"
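
The "show last 4 digits" rule might look like the sketch below. The name `mask_keep_last` and its parameters are hypothetical. Separators are kept so the masked value still looks like the original format, and values with four or fewer digits are masked entirely (an assumption made here so short inputs never pass through unmasked).

```python
def mask_keep_last(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all digits except the last `visible` ones; keep separators."""
    total_digits = sum(ch.isdigit() for ch in value)
    # Too few digits to show a safe suffix: mask every digit.
    to_mask = total_digits if total_digits <= visible else total_digits - visible
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


masked_id = mask_keep_last("123-45-6789")            # "***-**-6789"
masked_card = mask_keep_last("4532-1234-5678-9010")  # "****-****-****-9010"
```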

Hashing and tokenization

  • Hashing
    • One-way transformation; the original cannot be recovered
  • Tokenization
    • Replace with random token
    • Store mapping securely
  • Use cases
    • Hashing: password verification, duplicate detection
    • Tokenization: payments, compliance
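
The two techniques can be contrasted in a short sketch. `hash_for_matching` and `TokenVault` are hypothetical names; the vault is an in-memory stand-in for whatever secured store holds the real mapping.

```python
import hashlib
import hmac
import secrets


def hash_for_matching(value: str, key: bytes) -> str:
    """Keyed SHA-256 (HMAC): one-way, and stable for the same input and key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class TokenVault:
    """In-memory stand-in for a secured token store (illustration only)."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only code with access to the vault can recover the original.
        return self._token_to_value[token]
```

The stable digest suits duplicate detection. For password verification, a deliberately slow algorithm such as `hashlib.scrypt`, bcrypt, or Argon2 with a per-user salt is the usual choice rather than a fast hash like SHA-256.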

Redaction technique

  • Use for
    • Medical diagnoses, legal information
  • Implementation
    • Completely remove sensitive content, leaving a placeholder
  • Example
    • "Patient has diabetes" ==> "Patient has [REDACTED]"
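
A minimal redaction sketch, assuming the sensitive terms are known in advance. The `SENSITIVE_TERMS` list and the `redact` function are hypothetical; free-text medical or legal content usually needs a far richer detector than a fixed word list.

```python
import re

# Hypothetical term list, for illustration only.
SENSITIVE_TERMS = ["diabetes", "hypertension", "asthma"]

_TERM_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SENSITIVE_TERMS)) + r")\b",
    re.IGNORECASE,
)


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace each sensitive term with the placeholder."""
    return _TERM_PATTERN.sub(placeholder, text)


redacted = redact("Patient has diabetes")  # "Patient has [REDACTED]"
```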

Implementing masking functions

  • Create utility library
    • mask_national_id(), mask_credit_card(), mask_email(), mask_phone()
  • Apply consistently
    • Before logging, displaying, transmitting
  • Never store masked data
    • Store original encrypted, mask on output
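
One way to apply masking consistently is a small library plus a single field-to-function map, sketched below. Only `mask_email()` and `mask_phone()` are shown; `FIELD_MASKERS`, `mask_record()`, and the field names are assumptions for illustration. The record is copied on output, so the stored original is never modified.

```python
import re
from typing import Callable, Mapping


def mask_email(email: str) -> str:
    """Keep the first character and the domain; mask the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "*" * len(email)  # not a recognizable email: mask it all
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def mask_phone(phone: str) -> str:
    """Drop formatting and show only the last 4 digits."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) <= 4:
        return "*" * len(digits)  # too short to show a safe suffix
    return "*" * (len(digits) - 4) + digits[-4:]


# Hypothetical field names mapped to their masking functions.
FIELD_MASKERS: Mapping[str, Callable[[str], str]] = {
    "email": mask_email,
    "phone": mask_phone,
}


def mask_record(record: Mapping[str, str]) -> dict:
    """Return a masked copy for output; the original record is left untouched."""
    return {
        field: FIELD_MASKERS[field](value) if field in FIELD_MASKERS else value
        for field, value in record.items()
    }
```

Calling `mask_record()` at every output boundary (logs, UI, third-party payloads) keeps the masking rules in one place.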

Compliance and testing

  • GDPR requirements
    • Data minimization, purpose limitation
  • HIPAA requirements
    • Minimum necessary standard
  • Testing
    • Test with realistic patterns and edge cases
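
A sketch of what such tests could look like with `unittest`. The `mask_credit_card()` implementation here is a stand-in written for this example, and all card numbers are synthetic values in a realistic format, not real data.

```python
import unittest


def mask_credit_card(number: str) -> str:
    """Stand-in under test: mask all but the last 4 digits, keep separators."""
    digit_positions = [i for i, ch in enumerate(number) if ch.isdigit()]
    if len(digit_positions) > 4:
        hide = set(digit_positions[:-4])
    else:
        hide = set(digit_positions)  # too short: mask every digit
    return "".join("*" if i in hide else ch for i, ch in enumerate(number))


class MaskCreditCardTests(unittest.TestCase):
    def test_hyphenated(self):
        self.assertEqual(mask_credit_card("4532-1234-5678-9010"), "****-****-****-9010")

    def test_space_separated(self):
        self.assertEqual(mask_credit_card("4532 1234 5678 9010"), "**** **** **** 9010")

    def test_no_separators(self):
        self.assertEqual(mask_credit_card("4532123456789010"), "************9010")

    def test_empty_input(self):
        self.assertEqual(mask_credit_card(""), "")

    def test_short_input_is_fully_masked(self):
        self.assertEqual(mask_credit_card("123"), "***")

    def test_leading_digits_never_leak(self):
        self.assertNotIn("4532", mask_credit_card("4532-1234-5678-9010"))
```

Saved as a module, these tests can be run with `python -m unittest`.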

Let's practice!
