Foundations of Inference in Python
Paul Savala
Assistant Professor of Mathematics

investments_df.groupby('market')['funding_total_usd'].mean()
Market        Average funding
===========   ===============
Advertising      13806610
Analytics        14762930
Biotechnology    20838670
...              ...
health_df = investments_df[investments_df['market'] == 'Health and Wellness']
health_df['funding_total_usd'].plot(kind='hist')

health_log = np.log(health_df['funding_total_usd'])
health_log.plot(kind='hist')
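To see why the log transform helps, the sketch below measures skewness before and after taking logs. It uses synthetic log-normal values rather than the course's `investments_df`; the seed, sample size, and distribution parameters are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic, right-skewed "funding" amounts (NOT the course data).
rng = np.random.default_rng(42)
funding = rng.lognormal(mean=14.0, sigma=2.0, size=1000)

# Taking logs turns log-normal data into roughly normal data.
log_funding = np.log(funding)

# Skewness near 0 indicates a roughly symmetric distribution.
raw_skew = stats.skew(funding)
log_skew = stats.skew(log_funding)

print(f"skewness before log: {raw_skew:.2f}")
print(f"skewness after log:  {log_skew:.2f}")
```

A large positive skewness before the transform and a value near zero afterwards matches what the two histograms show visually.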

investments_df['log_funding'] = np.log(investments_df['funding_total_usd'])
investments_df.groupby('market')['log_funding'].std()
Advertising            2.254390
Analytics              2.152852
Biotechnology          1.946059
...                    ...
Levene test of equal variance
$H_0:$ Populations have equal variance
$H_a:$ Populations have different variances
from scipy import stats
health_df = investments_df[investments_df['market'] == 'Health and Wellness']
analytics_df = investments_df[investments_df['market'] == 'Analytics']
s, p_value = stats.levene(health_df['log_funding'], analytics_df['log_funding'])
print(p_value < 0.05)
False
Conclusion: Fail to reject the null hypothesis. There is no evidence that the two markets differ in the variance of their log funding.
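The Levene test can be tried on data where the answer is known in advance. The following is a minimal sketch on synthetic samples, not the course data; the group names, seed, and parameters are assumptions. Note that `scipy.stats.levene` centers on the median by default (`center='median'`).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic log-funding samples (NOT the course data).
group_a = rng.normal(loc=14.0, scale=2.0, size=500)
group_b = rng.normal(loc=14.0, scale=2.0, size=500)  # same spread as group_a
group_c = rng.normal(loc=14.0, scale=4.0, size=500)  # twice the spread

# H0: the two populations have equal variance.
stat_same, p_same = stats.levene(group_a, group_b)
stat_diff, p_diff = stats.levene(group_a, group_c)

print(p_same < 0.05)  # populations truly have equal variance
print(p_diff < 0.05)  # populations truly have different variances
```

For the pair with different spreads the test should reject the null hypothesis. For the pair with equal spreads it will usually fail to reject, although a false rejection still occurs about 5% of the time at this significance level.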
s, p_value = stats.f_oneway(health_df['log_funding'], analytics_df['log_funding'])
print(p_value < 0.05)
True
Conclusion: Reject the null hypothesis of equal means. The two markets differ significantly in mean log funding.
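`stats.f_oneway` runs a one-way ANOVA, which compares group means and assumes equal variances (hence the Levene check first). As a rough illustration on synthetic data (the samples, seed, and parameters below are assumptions, not the course data), with exactly two groups the ANOVA agrees with an equal-variance two-sample t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic log-funding samples with equal spread but different means
# (NOT the course data).
group_a = rng.normal(loc=14.0, scale=2.0, size=500)
group_b = rng.normal(loc=15.0, scale=2.0, size=500)

# One-way ANOVA. H0: all group means are equal.
f_stat, p_anova = stats.f_oneway(group_a, group_b)

# Equal-variance t-test on the same two groups.
t_stat, p_ttest = stats.ttest_ind(group_a, group_b, equal_var=True)

print(p_anova < 0.05)
print(np.isclose(f_stat, t_stat ** 2))  # F equals t squared for two groups
print(np.isclose(p_anova, p_ttest))     # and the p-values coincide
```

The advantage of `f_oneway` over a t-test is that it accepts more than two groups in a single call.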