Common data mistakes

Introduction to Data

Maarten Van den Broeck

Senior Content Developer at DataCamp

Common mistakes about data

  • Not having a clear goal or question
  • Insufficient or wrong data
  • Lack of appropriate analysis
  • No clear communication of results

Carefully plan the data analysis process

Not clearly defining the problem

"Did you buy anything in the last month?"

"Where did you make your last purchase?"

"Which payment method did you use?"

A poorly defined problem may lead to inappropriate data collection, analysis, and conclusions

Insufficient or wrong data

Data bias: the data sample doesn't represent the full population it was drawn from

  • Collecting the wrong data doesn't allow you to answer the research question
  • Data still needs cleaning before analysis
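
One simple way to spot data bias is to compare how groups are represented in the sample with how they are represented in the full population. The sketch below is illustrative only: the shopper groups, the counts, and the `group_shares` helper are all hypothetical and not part of the course material.

```python
from collections import Counter


def group_shares(labels):
    """Return each label's share of the total as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical population: half online shoppers, half in-store shoppers.
population = ["online"] * 500 + ["in_store"] * 500

# Hypothetical sample gathered through a web survey only,
# so online shoppers are heavily over-represented.
sample = ["online"] * 90 + ["in_store"] * 10

population_shares = group_shares(population)
sample_shares = group_shares(sample)

# Difference between each group's share in the sample and in the population.
share_gap = {
    group: round(sample_shares.get(group, 0.0) - share, 2)
    for group, share in population_shares.items()
}
print(share_gap)
```

A large gap for any group suggests the sample may not represent the population, so conclusions drawn from it should be treated with caution.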

Lack of appropriate analysis

  • Jumping to conclusions too quickly
  • Lack of context: a missing reason explaining the results
  • Other examples include:
    • Incorrect aggregations and calculations
    • Confusing correlation with causation
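
As an illustration of an incorrect aggregation, the sketch below averages per-store averages instead of averaging all orders directly. The store names and order values are hypothetical, chosen only to make the difference visible.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical order values for two stores of very different size.
orders_by_store = {
    "store_a": [10, 10, 10, 10, 10, 10, 10, 10],  # 8 small orders
    "store_b": [50, 50],                          # 2 large orders
}

# Incorrect: averaging the per-store averages gives each store equal weight,
# even though store_a has four times as many orders.
store_means = {store: mean(values) for store, values in orders_by_store.items()}
mean_of_means = mean(list(store_means.values()))

# Correct: average over every individual order.
all_orders = [value for values in orders_by_store.values() for value in values]
overall_mean = mean(all_orders)

print(mean_of_means)  # 30.0
print(overall_mean)   # 18.0
```

The two numbers answer different questions: the first describes a typical store, the second a typical order. Reporting one as if it were the other is a calculation mistake that is easy to miss.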

No clear communication of results

  • Communicating results is the most valuable part of the data life cycle
  • Unclear communication could lead to misunderstandings or incorrect conclusions
  • Examples:
    • Too technical
    • Cherry-picking data points
    • Unclear visualizations
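
Cherry-picking can be shown with a small numeric sketch: reporting only a favorable slice of a series can tell the opposite story from the full data. The monthly sales figures and the `percent_change` helper below are hypothetical.

```python
def percent_change(start, end):
    """Percentage change from start to end."""
    return (end - start) * 100 / start


# Hypothetical monthly sales for one year, trending downward overall.
monthly_sales = [100, 98, 95, 97, 92, 90, 94, 88, 85, 87, 82, 80]

# Cherry-picked view: only the one month-to-month uptick from June to July.
cherry_picked = percent_change(monthly_sales[5], monthly_sales[6])

# Full view: change from the first month to the last.
full_year = percent_change(monthly_sales[0], monthly_sales[-1])

print(round(cherry_picked, 1))  # 4.4
print(full_year)                # -20.0
```

Both figures are computed correctly, but presenting only the first one would mislead the audience about the overall trend.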

Let's practice!
