Introducing Survey Data Analysis

Analyzing Survey Data in Python

EbunOluwa Andrew

Data Scientist

What is survey data analysis?

  • Gain insight
    • More results = more understanding
  • Measure effects
    • High response rate is critical
    • Example: responses to a follow-up survey indicate whether changes to a product were effective



What is survey data?

  • Data responses from research participants
  • Quantitative or numerical
  • Qualitative or descriptive
  • Fair representation of the audience's opinions

Photo by Celpax on Unsplash - woman taking survey


Types of survey data

  • Ordinal
    • Responses make sense as an order
    • Sample question: How much do you use our product?
    • Sample answers: Never, Rarely, Sometimes, Often, Always
    • Order matters
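
One way to keep the answer order in pandas is an ordered categorical. The sketch below assumes a handful of made-up responses to the sample question; the variable names and values are illustrative, not taken from the course data.

```python
import pandas as pd

# Hypothetical responses to "How much do you use our product?"
responses = pd.Series(["Often", "Never", "Sometimes", "Always", "Rarely", "Often"])

# Answer options listed from lowest to highest frequency of use
levels = ["Never", "Rarely", "Sometimes", "Often", "Always"]

# An ordered categorical stores the ranking, so sorting and comparisons respect it
usage = pd.Series(pd.Categorical(responses, categories=levels, ordered=True))

print(usage.sort_values().tolist())   # sorted by rank, not alphabetically
print((usage >= "Sometimes").sum())   # respondents answering Sometimes or more often
print(usage.min(), usage.max())       # lowest and highest answers given
```

Without `ordered=True`, sorting would still follow the category list, but comparisons such as `>=` and `min()`/`max()` would raise an error.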



Types of survey data

  • Nominal
    • Different groups/categories
    • No order between the categories
    • Examples: city of birth, gender, ethnicity
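
Nominal answers can be stored as an unordered categorical and summarized by counting. This is a minimal sketch with made-up city names; only counts and the most common category are meaningful here, not a ranking.

```python
import pandas as pd

# Hypothetical answers to "What is your city of birth?"
city = pd.Series(
    ["Lagos", "Accra", "Lagos", "Nairobi", "Accra", "Lagos"],
    dtype="category",
)

print(city.cat.ordered)        # False: the categories have no ranking

# Frequencies per category are the natural summary for nominal data
counts = city.value_counts()
print(counts)

# The mode (most frequent category) is a valid "typical value"
most_common = city.mode()[0]
print(most_common)
```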



Types of survey data

  • Interval
    • Ordered data
    • Distance between values = meaningful and equal
    • Sample answer options: 22°C, 24°C, or 26°C

Thermometer - Photo by Bianca Ackermann on Unsplash

  • Ratio
    • Precise measurements
    • True zero point
    • Sample answer: $5321
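
The practical difference between interval and ratio data is whether ratios mean anything. The numbers below are made up for illustration: temperature differences survive a change of scale, but "twice as much" only holds for a quantity with a true zero.

```python
# Interval scale: temperatures in degrees Celsius (the zero point is arbitrary)
temp_a_c, temp_b_c = 10.0, 20.0

# The same two temperatures expressed in kelvin
temp_a_k, temp_b_k = temp_a_c + 273.15, temp_b_c + 273.15

# Differences are meaningful: they are the same size on both scales
difference_c = temp_b_c - temp_a_c
difference_k = temp_b_k - temp_a_k

# Ratios are not: the answer depends on where the scale puts zero
ratio_c = temp_b_c / temp_a_c   # 2.0 in Celsius
ratio_k = temp_b_k / temp_a_k   # roughly 1.035 in kelvin

# Ratio scale: money has a true zero, so "twice as much" is meaningful
amount_a, amount_b = 2500.0, 5000.0
amount_ratio = amount_b / amount_a

print(difference_c, difference_k)
print(ratio_c, ratio_k)
print(amount_ratio)
```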

House made from one hundred dollars - Photo by Kostiantyn Li on Unsplash


Defining goals

  • Define research goals
  • Response rate
  • Learn from feedback



Sampling for surveys

  • Usually impractical to collect responses from every member of a large population
  • Sampling is used to make estimates or test hypotheses about the population
  • Different sampling techniques are used to create unbiased samples

Photo by Joseph Chan on Unsplash - people holding umbrella on road at daytime


Sampling techniques overview

  • Simple random
    • Randomly selected subgroup of a population
  • Stratified random
    • Population divided into strata (e.g., race, gender)
    • Subgroup randomly selected from each stratum
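
Both techniques can be sketched with pandas. The population below is hypothetical (100 people, 60 Female and 40 Male), and the column names, sample sizes, and seed are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical population of 100 people: 60 Female, 40 Male
population = pd.DataFrame({
    "respondent_id": range(100),
    "gender": ["Female"] * 60 + ["Male"] * 40,
})

# Simple random sampling: every person has the same chance of selection
simple_sample = population.sample(n=20, random_state=42)

# Stratified random sampling: draw 20% at random within each gender stratum,
# so the sample keeps the population's 60/40 split exactly
stratified_sample = population.groupby("gender").sample(frac=0.2, random_state=42)

print(simple_sample["gender"].value_counts())
print(stratified_sample["gender"].value_counts())
```

The simple random sample may drift from the 60/40 split by chance, while the stratified sample always contains 12 Female and 8 Male respondents.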

Analytics Vidhya - stratified random sampling


Sampling techniques overview

  • Weighted
    • Select subgroup that matches population proportions
  • Cluster
    • Population divided into clusters (e.g., schools, cities)
    • Subgroup formed from randomly selected clusters
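
Cluster sampling can be sketched as follows. The example assumes a hypothetical population of 100 students in 5 schools and uses the one-stage variant, where every member of each chosen cluster is kept; the names and sizes are illustrative only.

```python
import pandas as pd

# Hypothetical population: 5 schools (clusters) with 20 students each
students = pd.DataFrame({
    "student_id": range(100),
    "school": [f"School {i}" for i in range(1, 6) for _ in range(20)],
})

# Randomly choose 2 of the 5 schools
schools = pd.Series(students["school"].unique())
chosen_schools = schools.sample(n=2, random_state=0)

# One-stage cluster sample: keep every student in the chosen schools
cluster_sample = students[students["school"].isin(chosen_schools)]

print(sorted(chosen_schools))
print(len(cluster_sample))
```

A two-stage design would add a further random draw of students inside each chosen school.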

Aoife Dalton on SlideServe - cluster sampling illustration


Crosstab - a common way to analyze survey data

  • pd.crosstab() -> counts how often each combination of two categorical variables (nominal or ordinal) occurs
print(survey.head())
| Age     | Occupation_Title          | Current Student | Gender | Education  |
|---------|---------------------------|-----------------|--------|------------|
| 18 - 24 | Credit officer            | No              | Female | Bachelor's |
| 18 - 24 | Student                   | Yes, Full-Time  | Male   | Bachelor's |
| 18 - 24 | Student                   | Yes, Full-Time  | Female | Bachelor's |
| 25 - 34 | Senior Financial Analyst  | No              | Female | Bachelor's |
| 35 - 44 | Public Relations Director | No              | Female | Bachelor's |

Crosstab function

import pandas as pd
# Count respondents for each combination of age group and gender
cross_tabulation = pd.crosstab(survey.Age, survey.Gender)
cross_tabulation
|         | Female | Male |
|---------|--------|------|
| 18 - 24 |     39 |   12 |
| 25 - 34 |     28 |   12 |
| 35 - 44 |      5 |    1 |
| 45 - 54 |      3 |    0 |
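
Two optional arguments of pd.crosstab() are often useful with survey data: margins=True adds row and column totals, and normalize="index" turns counts into row proportions. The sketch below rebuilds only the five rows shown by survey.head(), so its numbers differ from the full table above; the name survey_head is an assumption for illustration.

```python
import pandas as pd

# Only the five rows shown by survey.head(), reduced to the two columns used
survey_head = pd.DataFrame({
    "Age": ["18 - 24", "18 - 24", "18 - 24", "25 - 34", "35 - 44"],
    "Gender": ["Female", "Male", "Female", "Female", "Female"],
})

# Counts with an extra "All" row and column holding the totals
counts = pd.crosstab(survey_head["Age"], survey_head["Gender"], margins=True)
print(counts)

# Share of each gender within every age group (each row sums to 1)
row_shares = pd.crosstab(survey_head["Age"], survey_head["Gender"], normalize="index")
print(row_shares.round(2))
```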

Let's practice!
