Analyzing Survey Data in Python
EbunOluwa Andrew
Data Scientist
yp_survey
| Gender | Age | Entertainment |
|--------|-----|---------------|
| female | 19 | Agree |
| female | 20 | Agree |
| male | 19 | Agree |
| female | 19 | Disagree |
| female | 24 | Disagree |
| male | 18 | Agree |
| male | 18 | Agree |
| male | 20 | Disagree |
import pandas as pd
yp_crosstab = pd.crosstab(
    yp_survey['Gender'],
    yp_survey['Entertainment'])
yp_crosstab
| | Agree | Disagree |
|--------|-------|----------|
| female | 81 | 84 |
| male | 84 | 46 |
yp_crosstab.plot.barh()
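A minimal sketch of the same crosstab, assuming only the eight preview rows shown above (the name `yp_preview` is hypothetical, and the full `yp_survey` data produces the larger counts in the table). It also uses the `normalize='index'` option of `pd.crosstab` to show each row as proportions, which can make groups of different sizes easier to compare.

```python
import pandas as pd

# Hypothetical frame holding only the eight preview rows of yp_survey shown above.
yp_preview = pd.DataFrame({
    'Gender': ['female', 'female', 'male', 'female',
               'female', 'male', 'male', 'male'],
    'Age': [19, 20, 19, 19, 24, 18, 18, 20],
    'Entertainment': ['Agree', 'Agree', 'Agree', 'Disagree',
                      'Disagree', 'Agree', 'Agree', 'Disagree'],
})

# Count respondents for each Gender x Entertainment combination.
preview_crosstab = pd.crosstab(yp_preview['Gender'],
                               yp_preview['Entertainment'])
print(preview_crosstab)

# normalize='index' divides each row by its row total,
# giving the share of Agree/Disagree within each gender.
preview_share = pd.crosstab(yp_preview['Gender'],
                            yp_preview['Entertainment'],
                            normalize='index')
print(preview_share)
```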
survey = yp_survey.groupby(['Gender', 'Entertainment'])['Age'].count().reset_index()
survey.columns = ['Gender', 'Entertainment', 'Respondents']
survey
| Gender | Entertainment | Respondents |
|--------|---------------|-------------|
| female | Agree | 81 |
| female | Disagree | 84 |
| male | Agree | 84 |
| male | Disagree | 46 |
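The grouped count above can be sketched in a slightly different way. `['Age'].count()` counts non-missing `Age` values, so a respondent with a missing age would be left out, whereas `.size()` counts rows. The example below assumes only the eight preview rows (the name `yp_preview` is hypothetical); with no missing ages, both approaches agree.

```python
import pandas as pd

# Hypothetical frame holding only the eight preview rows of yp_survey.
yp_preview = pd.DataFrame({
    'Gender': ['female', 'female', 'male', 'female',
               'female', 'male', 'male', 'male'],
    'Age': [19, 20, 19, 19, 24, 18, 18, 20],
    'Entertainment': ['Agree', 'Agree', 'Agree', 'Disagree',
                      'Disagree', 'Agree', 'Agree', 'Disagree'],
})

# count() tallies non-null Age values in each group.
by_count = (yp_preview.groupby(['Gender', 'Entertainment'])['Age']
            .count()
            .reset_index(name='Respondents'))

# size() tallies rows in each group, regardless of missing values.
by_size = (yp_preview.groupby(['Gender', 'Entertainment'])
           .size()
           .reset_index(name='Respondents'))

print(by_size)
```

Passing `name='Respondents'` to `reset_index` also names the count column in one step, so the separate `survey.columns = [...]` assignment is not needed in this variant.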
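# Share of all respondents that falls in each group
survey['% total respondents'] = survey.Respondents * 100./survey.Respondents.sum()
# Share of each group in the target population (supplied values)
survey['% of population'] = [35, 25, 20, 20]
# Weight > 1: group is under-represented in the sample; < 1: over-represented
survey['Weight'] = survey['% of population']/survey['% total respondents']
# Respondent counts rescaled to match the population shares
survey['Weighted Respondents'] = survey.Weight * survey.Respondents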
survey[['Gender', 'Entertainment',
        'Respondents', 'Weighted Respondents']].set_index(
    ['Gender', 'Entertainment']).plot.barh()
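The weighting arithmetic can be checked by hand with a self-contained sketch. It assumes the four respondent counts from the table above and the same population percentages, and rebuilds `survey` directly instead of deriving it from `yp_survey`.

```python
import pandas as pd

# Respondent counts copied from the grouped table above.
survey = pd.DataFrame({
    'Gender': ['female', 'female', 'male', 'male'],
    'Entertainment': ['Agree', 'Disagree', 'Agree', 'Disagree'],
    'Respondents': [81, 84, 84, 46],
})

total = survey['Respondents'].sum()  # 295 respondents in all

# Sample share of each group, in percent.
survey['% total respondents'] = survey['Respondents'] * 100 / total

# Population share of each group, in percent (values from the slide).
survey['% of population'] = [35, 25, 20, 20]

# Weight = population share / sample share.
survey['Weight'] = survey['% of population'] / survey['% total respondents']

# Weighted count = weight * raw count.
survey['Weighted Respondents'] = survey['Weight'] * survey['Respondents']

print(survey.round(2))
```

Because the population percentages sum to 100, the weighted counts still sum to the original 295 respondents. Only the distribution across groups changes: for example, female/Agree rises from 81 to 103.25 because that group makes up about 27.5% of the sample but 35% of the population.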