Generating hypotheses

Exploratory Data Analysis in Python

George Boorman

Curriculum Manager, DataCamp

What do we know?

Countplot showing the number of flights per airline in different price categories, with Jet Airways having the largest number of First Class tickets

What do we know?

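import matplotlib.pyplot as plt
import seaborn as sns
sns.heatmap(planes.corr(numeric_only=True), annot=True)  # skip non-numeric columns such as Airline
plt.show()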

Heatmap showing Pearson correlation coefficient scores between variables in the planes dataset

Spurious correlation

sns.scatterplot(data=planes, x="Duration", y="Price", hue="Total_Stops")
plt.show()

Scatter plot of Price versus Duration, colored by Total Stops

How do we know?

Heatmap with correlation coefficient scores for each number of stops

What is true?

Typewriter displaying "Fake News"

  • Would data from a different time give the same results?

  • Detecting relationships, differences, and patterns:

    • We use hypothesis testing
  • Hypothesis testing requires, prior to data collection:

    • Generating a hypothesis or question
    • A decision on what statistical test to use
1 Image credit: https://unsplash.com/@markuswinkler
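
As a rough illustration of what a hypothesis test does, the sketch below runs a permutation test on the difference in mean price between two groups. The permutation test is only one possible choice of test, picked here because it needs nothing beyond the standard library; the prices and the function name permutation_p_value are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference in means
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical ticket prices for two airlines (not from the planes dataset)
airline_a = [10, 12, 11, 13, 12, 11, 10, 12]
airline_b = [15, 17, 16, 18, 17, 16, 15, 17]

p_value = permutation_p_value(airline_a, airline_b)
print(f"p-value: {p_value:.4f}")
```

The hypothesis (mean prices differ between the two airlines) and the choice of test are fixed before looking at the result, which is the order the bullets above call for.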

Data snooping

 

Office with a view looking out onto an airport runway

Magnifying glass looking into a bar chart

Generating hypotheses

sns.barplot(data=planes, x="Airline", y="Duration")
plt.show()

Bar plot of duration versus airline

Generating hypotheses

sns.barplot(data=planes, x="Destination", y="Price")
plt.show()

Bar plot showing average price per destination

Next steps

  • Design our experiment

  • Involves steps such as:

    • Choosing a sample
    • Calculating how many data points we need
    • Deciding what statistical test to run
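
For the "how many data points" step, one common starting point is the normal-approximation formula for comparing two group means, n = 2 * ((z_alpha + z_power) / d)^2 per group, where d is the standardized effect size. The sketch below assumes a two-sided test, equal group sizes, and equal variances; the function name sample_size_per_group and the default values are illustrative choices, not something the slides specify.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided comparison of two means.

    Uses the normal approximation; effect_size is the difference in means
    divided by the common standard deviation.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Medium standardized effect, 5% significance level, 80% power
n = sample_size_per_group(effect_size=0.5)
print(n)
```

Smaller effects need more data: halving the effect size roughly quadruples the required sample. A t-based calculation gives a slightly larger number than this approximation.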

Let's practice!
