Descriptive statistics in survey analysis

Analyzing Survey Data in Python

EbunOluwa Andrew

Data Scientist

What are descriptive statistics in survey analysis?

  • Basic measures used to describe survey data
  • Describe single variables and the associated survey sample


Why use descriptive statistics?

  • Allow data to be summarized clearly
  • Forms
    • Tabular presentations
    • Visualizations
  • Help identify outliers


Frequency and distributions

  • Grouped data based on number of occurrences in each class
  • Apply to both qualitative and quantitative data
  • Count of the different outcomes in a raw survey dataset
  • Shown as bar charts, histograms, pie charts, line charts, etc.
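
As a minimal sketch of what "count of different outcomes" means, the snippet below tallies a small, hypothetical list of answers (the `responses` values are invented for illustration and are not taken from any survey) into absolute and relative frequencies using only the standard library.

```python
from collections import Counter

# Hypothetical answers to one survey question (illustrative values only).
responses = ["Male", "Female", "Male", "Male", "Female", "Male"]

# Absolute frequency: how many times each outcome occurs.
counts = Counter(responses)

# Relative frequency: each count as a share of all responses.
total = sum(counts.values())
shares = {answer: n / total for answer, n in counts.items()}

# Print outcomes from most to least frequent.
for answer, n in counts.most_common():
    print(f"{answer}: {n} ({shares[answer]:.0%})")
```

The same counts are what a bar chart of the variable would display, one bar per outcome.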


Central tendency: mean, median, mode

  • Single value reflecting center of data distribution
  • Mean = average value
  • Median = middle value of the dataset sorted in ascending order
  • Mode = most frequent value in dataset
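
The three measures can be sketched on a handful of values with Python's standard `statistics` module. The `meals` list is a small illustrative sample, not a full dataset, so treat the numbers as a worked example only.

```python
import statistics

# Small illustrative sample of "meals per day" answers (assumed values).
meals = [5, 4, 3, 2, 3]

mean_meals = statistics.mean(meals)      # sum of values / number of values
median_meals = statistics.median(meals)  # middle value of sorted [2, 3, 3, 4, 5]
mode_meals = statistics.mode(meals)      # most frequent value

print(mean_meals, median_meals, mode_meals)
```

Here the mean (3.4) sits above the median and mode (both 3) because the single answer of 5 pulls the average upward.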

Measures of variability

  • Describe how far data points fall from the center
  • Range
    • Distance between highest and lowest values
  • Standard deviation
    • Square root of the variance
    • Indicates how far values typically lie from the mean
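
A short worked example may help connect range, variance, and standard deviation. It uses the standard `statistics` module on an assumed five-value sample; `statistics.variance` and `statistics.stdev` are the sample versions (dividing by n - 1).

```python
import math
import statistics

# Small illustrative sample (assumed values).
meals = [5, 4, 3, 2, 3]

# Range: distance between the highest and lowest values.
value_range = max(meals) - min(meals)

# Sample variance: squared deviations from the mean, divided by n - 1.
sample_variance = statistics.variance(meals)

# Standard deviation: square root of the variance.
sample_std = statistics.stdev(meals)

print(value_range, sample_variance, round(sample_std, 4))
print(math.isclose(sample_std, math.sqrt(sample_variance)))
```

Because the standard deviation is the square root of the variance, it is expressed in the same units as the data (meals per day here), which makes it easier to interpret than the variance itself.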


Survey: dietary_habits

dietary_habits.head()
| Age   | Gender | meals_per_day | eat_out_per_wk |
|-------|--------|---------------|----------------|
| 18-24 | Male   |             5 |              4 |
| 18-24 | Male   |             4 |              1 |
| 45-54 | Male   |             3 |              3 |
| 18-24 | Male   |             2 |              1 |
| 18-24 | Female |             3 |              1 |

Frequency distribution: dietary_habits

dietary_habits.Gender.value_counts().to_frame("Number")
|        | Number |
|--------|--------|
| Male   | 40     |
| Female | 38     |

Index: Gender


Frequency distribution: dietary_habits

dietary_habits.Gender.value_counts().to_frame("Number").plot(kind='bar')

Figure: bar plot of the gender frequency distribution


Measures of central tendency: dietary_habits

  • .mean()
  • .median()
  • .mode()

Measures of central tendency: dietary_habits

  • .mean()
dietary_habits.mean(numeric_only=True)
meals_per_day     3.128205
eat_out_per_wk    1.897436
dtype: float64

Measures of central tendency: dietary_habits

  • .median()
dietary_habits.median(numeric_only=True)
meals_per_day     3.0
eat_out_per_wk    1.5
dtype: float64

Measures of central tendency: dietary_habits

  • .mode()
dietary_habits.mode()
| Age   | Gender | meals_per_day | eat_out_per_wk |
|-------|--------|---------------|----------------|
| 18-24 | Male   |             3 |              1 |
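
Unlike `.mean()` and `.median()`, `.mode()` returns a DataFrame, because a column can have several tied modes (shorter columns are padded with NaN). The sketch below rebuilds only the five rows shown by `dietary_habits.head()`; the name `sample` is hypothetical and the full survey is not reproduced.

```python
import pandas as pd

# Only the five rows displayed by dietary_habits.head(); not the full survey.
sample = pd.DataFrame({
    "Age": ["18-24", "18-24", "45-54", "18-24", "18-24"],
    "Gender": ["Male", "Male", "Male", "Male", "Female"],
    "meals_per_day": [5, 4, 3, 2, 3],
    "eat_out_per_wk": [4, 1, 3, 1, 1],
})

# One column per variable, one row per tied mode.
modes = sample.mode()

# Take the first row to get a single mode per column.
first_modes = modes.iloc[0]
print(first_modes)
```

Note that `.mode()` works on categorical columns such as Age and Gender as well as numeric ones, which is why it needs no `numeric_only` argument.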

Measures of variability: dietary_habits

print(dietary_habits.meals_per_day.max() - dietary_habits.meals_per_day.min())
3
print(dietary_habits.meals_per_day.std())
0.6518500018473766
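
One detail worth knowing: pandas `.std()` computes the sample standard deviation by default (`ddof=1`). The sketch below uses only the five `meals_per_day` values visible in `dietary_habits.head()`, so its results differ from the full-survey figures above.

```python
import pandas as pd

# The five meals_per_day values shown by dietary_habits.head(); not the full column.
meals = pd.Series([5, 4, 3, 2, 3], name="meals_per_day")

# Range: highest value minus lowest value.
value_range = meals.max() - meals.min()

# Default: sample standard deviation (divides by n - 1).
sample_std = meals.std()

# Population standard deviation (divides by n).
population_std = meals.std(ddof=0)

print(value_range, round(sample_std, 4), round(population_std, 4))
```

For survey data, which is usually a sample of a larger population, the default `ddof=1` is generally the appropriate choice.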

Let's practice!
