Getting the right data

Artificial Intelligence (AI) Strategy

Vidhi Chugh

AI strategist and ethicist

Data availability

  • More data -> better model outcomes

  • Not just the data quantity
  • Labeling, quality, and timeliness
  • Model captures underlying patterns

  • Data-centric science
  • Systematically engineering the data

Data relevance

 

  • Richness and relevance of patterns
  • Irrelevant data misleads the model -> lowers accuracy

 

  • Example: credit scoring model
    • Relevant attributes:
      • Transaction history
      • Asset profile

 

Time relevancy

  • Supply chain dynamics have changed since the pandemic, so older data may no longer reflect current patterns

Data privacy

 

  • Sensitive user data
  • Data privacy standards, such as GDPR
  • Ethical practices
  • Enables user trust

Data dictionary

 

  • Describes the meaning and significance of each data field
  • Domain experts link data with business decisions
  • Redundant data does not add value to model predictions

Data sampling

  • When working with large volumes of data
  • Create a sampled dataset
  • Comparable results while saving time and budget
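
The sampling idea above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the helper name `sample_records`, the 1% fraction, and the synthetic dataset are all assumptions made for the example. It draws a simple random sample and compares its mean with that of the full dataset.

```python
import random
from statistics import mean


def sample_records(records, fraction, seed=0):
    """Return a simple random sample containing `fraction` of the records."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    k = max(1, round(len(records) * fraction))
    return rng.sample(records, k)


# Synthetic stand-in for a large dataset (illustrative values only).
data_rng = random.Random(42)
full_dataset = [data_rng.gauss(100, 15) for _ in range(100_000)]

# Hypothetical choice: keep 1% of the records.
sampled = sample_records(full_dataset, fraction=0.01, seed=1)

print(len(full_dataset), len(sampled))
print(round(mean(full_dataset), 2), round(mean(sampled), 2))
```

Simple random sampling is only one option. When some groups are rare, a stratified sample may represent them better.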

Data augmentation

  • But what if the data is not sufficient?
  • Augment data: create new records from existing datasets
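
As a rough sketch of creating new records from existing ones, the example below mirrors a tiny image stored as nested lists. The function names and the toy image are hypothetical, and horizontal flipping is just one common augmentation. It only suits tasks where a mirrored image keeps the same label.

```python
def flip_horizontal(image):
    """Return a mirrored copy of an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def augment(images):
    """Return the original images followed by one flipped copy of each."""
    return images + [flip_horizontal(img) for img in images]


# Toy 2x3 "image" (illustrative pixel values only).
original = [
    [[0, 1, 2],
     [3, 4, 5]],
]

augmented = augment(original)
print(len(augmented))  # 2
print(augmented[1])    # [[2, 1, 0], [5, 4, 3]]
```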

[Image: augmenting new images]

Data diversity

 

  • Create reliable models

  • Good model accuracy

  • Example: Loan application

    • Include different age groups and ethnicities
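
One way to check diversity before training is to measure how much of the dataset each group makes up. The sketch below assumes a hypothetical `age_group` field and an arbitrary 20% threshold. Both are illustrative choices, not recommended values.

```python
from collections import Counter


def group_shares(records, field):
    """Return each group's share of the records for the given field."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(records, field, min_share):
    """Return the groups whose share falls below `min_share`, sorted by name."""
    shares = group_shares(records, field)
    return sorted(g for g, s in shares.items() if s < min_share)


# Made-up loan applications (illustrative only).
applications = (
    [{"age_group": "18-30"}] * 1
    + [{"age_group": "31-50"}] * 6
    + [{"age_group": "51+"}] * 3
)

print(group_shares(applications, "age_group"))
print(underrepresented(applications, "age_group", min_share=0.2))  # ['18-30']
```

A group that is entirely absent from the data will not appear in the output, so the expected groups should also be checked against a known list.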

 

Data quality

 

  • Complete and comprehensive
  • Accurate data, e.g., age consistent with date of birth
  • Missing data
  • Correct data labels
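
The quality checks above can be sketched as simple record-level validation. The field names, the `REQUIRED_FIELDS` schema, and the fixed reference date are assumptions made for this example. It flags missing values and cases where the stated age disagrees with the date of birth.

```python
from datetime import date

# Hypothetical schema: fields every record is expected to carry.
REQUIRED_FIELDS = ("name", "age", "date_of_birth")


def age_on(dob, today):
    """Age in whole years on `today` for someone born on `dob`."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


def quality_issues(record, today):
    """Return a list of human-readable problems found in one record."""
    issues = [f"missing {f}" for f in REQUIRED_FIELDS if record.get(f) is None]
    age, dob = record.get("age"), record.get("date_of_birth")
    if age is not None and dob is not None and age != age_on(dob, today):
        issues.append("age does not match date_of_birth")
    return issues


today = date(2024, 6, 1)  # fixed reference date so results are reproducible
records = [
    {"name": "A", "age": 34, "date_of_birth": date(1990, 3, 15)},
    {"name": "B", "age": 40, "date_of_birth": date(1990, 3, 15)},
    {"name": "C", "age": None, "date_of_birth": date(1985, 12, 1)},
]

for r in records:
    print(r["name"], quality_issues(r, today))
```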

Let's practice!