Types of data bias

Conquering Data Bias

Konstantinos Kattidis

Data Analytics Lead

The dynamics of decision making

Person deciding about dinner and career

Adults make approximately 35,000 conscious decisions each day

"What career path should I pursue?"

"What should I eat for dinner?"

  • Heuristics help our brains simplify information processing and reach decisions faster

  • BUT heuristics can also cause cognitive biases

1 https://hbr.org/2023/12/a-simple-way-to-make-better-decisions

Cognitive biases

Systematic patterns of deviation from norm or rationality in judgment and decision-making processes

  • Data bias can result from these cognitive biases

  • For example: an analyst unconsciously favoring positive data while analyzing a recent marketing campaign

Analyst focusing on positive data


Systemic biases

  • While cognitive biases pertain to individual decision-making, systemic biases concern broader issues in how data-related activities are carried out
  • These biases are inherent in the processes, structures, or systems used to collect, analyze, and interpret data
  • They originate from sources such as biased data collection methods and algorithmic design

Woman thoughtful about systemic biases


Bias in the data lifecycle

Diagram with data lifecycle and biases

  • Systemic and cognitive biases represent the origins of data bias
  • Understanding the various types of data bias is the first step toward building a robust defense against their impact

Unveiling data collection biases

  • Selection bias
    • The collection process favors certain groups or characteristics over others
  • Historical bias
    • Historical data reflects past inequalities or systemic issues
  • Measurement bias
    • Instruments or methodologies systematically misrepresent certain attributes
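
One way to spot possible selection bias is to compare each group's share of the collected sample with its share of a reference population. The sketch below is only illustrative: the function name `representation_gap`, the age bands, and the population shares are all made-up assumptions, not figures from this course.

```python
from collections import Counter


def representation_gap(sample_groups, population_shares):
    """Return (sample share - population share) for each group.

    A positive value means the group is over-represented in the sample,
    a negative value means it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical survey respondents by age band (assumed data).
sample = ["18-34"] * 60 + ["35-54"] * 30 + ["55+"] * 10
# Hypothetical reference shares, e.g. taken from a census (assumed data).
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

gaps = representation_gap(sample, population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large gap does not prove the collection process is biased, but it is a signal that the process may favor some groups over others and is worth investigating.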

Diagram with biases in data lifecycle, data collection


Unveiling bias in data analysis

  • Cognitive bias
    • Confirmation bias is one prominent type
    • It refers to the tendency to seek and interpret information that confirms pre-existing beliefs
  • Reporting bias
    • Occurs when certain findings are highlighted or suppressed, shaping the narrative around the data

Diagram with biases in data lifecycle, data analysis


Bias in model development

  • Algorithmic bias
    • Occurs when machine learning models reflect the biases present in the training data
  • Automation bias
    • The tendency to over-trust automated outputs
    • Emphasizes the importance of human oversight, as automated processes may unwittingly perpetuate or amplify existing biases
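
A simple human-oversight check for algorithmic bias is to compare how often a model predicts the positive outcome for each group. The sketch below assumes binary predictions (1 = positive) and two made-up groups; the function name `positive_rate_by_group` and all data are hypothetical. The difference between the highest and lowest rate is sometimes called the demographic parity difference.

```python
def positive_rate_by_group(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals, positives = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical model outputs (1 = approved) for two made-up groups.
preds = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = positive_rate_by_group(preds, groups)
# Gap between the most and least favored group.
gap = max(rates.values()) - min(rates.values())

for group, rate in rates.items():
    print(f"group {group}: positive rate {rate:.2f}")
print(f"gap: {gap:.2f}")
```

A gap on its own does not show that the model is unfair, since groups can differ for legitimate reasons, but it flags results that a person should review before they are acted on.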

Diagram with biases in data lifecycle, algorithms


Let's practice!
