What do we do with data biases?

Introduction to Data Ethics

Shalini Kurapati, PhD

Co-founder and CEO, Clearbox AI

What is bias?

Illustration of a biased judgement.

  • Often the first topic that comes to mind in data ethics
  • Unfair prejudice for or against a person, group, or idea
  • In data ethics: data bias
  • Under-, over-, or misrepresentation
  • Harmful effects if biased data is used for decision-making
1 Icon made by noomtah from www.flaticon.com

Anytime, anywhere

  • At all stages of the data life cycle
  • Data collection
  • Data cleaning, preparation, development
  • Data labeling

Illustration of data biases at different stages of the data life cycle feeding into biased algorithms


Data specific biases

  • Technical or statistical process bias

    • Sampling bias (representation)
    • Measurement bias
    • Self-reporting bias
    • Labeling bias
  • Human and systematic bias

    • Gender
    • Ethnicity/minorities
    • Culture

Illustration of a skewed object versus its real image.

1 Icon made by Flowicon from www.flaticon.com
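
As a rough illustration of sampling bias, the sketch below compares each group's share in a dataset with its share in a reference population. The function name `representation_gaps`, the group labels, and all numbers are hypothetical and chosen only for illustration; in practice the reference shares would come from a trusted source such as census data.

```python
from collections import Counter


def representation_gaps(sample_groups, reference_shares):
    """Return (sample share - reference share) for each reference group.

    A positive gap means the group is over-represented in the sample,
    a negative gap means it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical data: 80 records from men and 20 from women,
# compared against an assumed 50/50 reference population.
sample = ["male"] * 80 + ["female"] * 20
reference = {"male": 0.5, "female": 0.5}

gaps = representation_gaps(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: gap of {gap:+.2f} relative to the reference share")
```

A check like this only covers representation by headcount. It says nothing about measurement, self-reporting, or labeling bias, which need their own checks.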

Representation is crucial

Screenshot of Joy Buolamwini's TED talk, with a photo of her wearing a white mask

  • Lack of representative data
  • Facial recognition: better outcomes when wearing a white mask
  • Impact ranges from hand soap dispensers to healthcare outcomes
  • Recommended: Coded Bias (documentary)
1 https://www.ted.com/talks/joy_buolamwini_how_i_m_fighting_bias_in_algorithms

A mirror to our stereotypes

Illustration of the different symptoms of a heart attack: chest constriction for a man and back pain for a woman.

  • Medical studies and data collection have historically focused on white males
  • E.g., heart attack symptoms are different for men and women
  • What happens if a diagnostic app is built on this data?
1 https://womeningh.org/

Serious impact

Screenshot of a healthcare app's diagnosis of heart attack symptoms for men and women.

  • Heart attack symptoms: diagnosed as an emergency for a man, as a panic attack for a woman

Way too many and counting!

Illustration of the various cognitive biases.

1 CC BY, John Manoogian III (JM3)

Tip of the iceberg

Illustration of an iceberg with statistical biases at the tip and human and systematic biases below the surface

  • Statistical biases are only the tip of the iceberg
  • Human and systematic bias: addressing it requires changes in attitudes and in society
  • Test for bias, work on prevention and mitigation, and stay open to feedback
1 Schwartz, R., Vassilev, A., Greene, K., Perine, L., Burt, A. and Hall, P. (2022), Towards a Standard for Identifying and Managing Bias in Artificial Intelligence, National Institute of Standards and Technology
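
One simple way to test for bias is to compare an error rate across groups. The minimal sketch below computes, per group, the share of true emergencies that a diagnostic app failed to flag, echoing the heart attack example. The function name `miss_rate_by_group`, the record layout, and the numbers are hypothetical, and this is one of many possible fairness checks, not a complete audit.

```python
from collections import defaultdict


def miss_rate_by_group(records):
    """Share of true emergencies that were NOT flagged, per group.

    Each record is a tuple: (group, is_emergency, flagged_by_app).
    Groups with no true emergencies are left out of the result.
    """
    missed = defaultdict(int)
    actual = defaultdict(int)
    for group, is_emergency, flagged in records:
        if is_emergency:
            actual[group] += 1
            if not flagged:
                missed[group] += 1
    return {group: missed[group] / actual[group] for group in actual}


# Hypothetical evaluation data: 10 true emergencies per group.
records = (
    [("male", True, True)] * 9
    + [("male", True, False)] * 1
    + [("female", True, True)] * 6
    + [("female", True, False)] * 4
)

rates = miss_rate_by_group(records)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} of true emergencies missed")
```

A large gap between groups is a signal to investigate the training data and labels. It does not by itself explain where the bias entered the life cycle.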

Let's practice!
