Case Study: Exploratory Data Analysis in R
Dave Robinson
Chief Data Scientist, DataCamp
model <- lm(percent_yes ~ year, data = afghanistan)
summary(model)
Call:
lm(formula = percent_yes ~ year, data = afghanistan)
Residuals:
      Min        1Q    Median        3Q       Max
-0.254667 -0.038650 -0.001945  0.057110  0.140596
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.106e+01  1.471e+00  -7.523 1.44e-08 ***
year         6.009e-03  7.426e-04   8.092 3.06e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.08497 on 32 degrees of freedom
Multiple R-squared:  0.6717,    Adjusted R-squared:  0.6615
F-statistic: 65.48 on 1 and 32 DF, p-value: 3.065e-09
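The printed summary is convenient to read but awkward to compute with. As a minimal sketch, assuming a small made-up data frame (toy values, not the UN voting data), base R's coef(summary(model)) returns the same coefficient table as a numeric matrix, which can be indexed by row and column name:

```r
# Toy data standing in for one country's yearly record (hypothetical values).
toy <- data.frame(
  year = c(1947, 1957, 1967, 1977, 1987, 1997),
  percent_yes = c(0.40, 0.55, 0.60, 0.72, 0.80, 0.85)
)
model <- lm(percent_yes ~ year, data = toy)

# coef(summary()) gives the coefficient table as a matrix with
# columns "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coefs <- coef(summary(model))

# Pull out the slope and its p-value by name.
slope <- coefs["year", "Estimate"]
p_value <- coefs["year", "Pr(>|t|)"]
```

This works for a single model, but the result is a matrix with the terms stored as row names, which makes it awkward to combine across several models.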
model1 <- lm(percent_yes ~ year, data = afghanistan)
model2 <- lm(percent_yes ~ year, data = united_states)
model3 <- lm(percent_yes ~ year, data = canada)
library(broom)
tidy(model)
         term      estimate    std.error statistic      p.value
1 (Intercept) -11.063084650 1.4705189228 -7.523252 1.444892e-08
2        year   0.006009299 0.0007426499  8.091698 3.064797e-09
model1 <- lm(percent_yes ~ year, data = afghanistan)
model2 <- lm(percent_yes ~ year, data = united_states)
tidy(model1)
         term      estimate    std.error statistic      p.value
1 (Intercept) -11.063084650 1.4705189228 -7.523252 1.444892e-08
2        year   0.006009299 0.0007426499  8.091698 3.064797e-09
tidy(model2)
         term     estimate    std.error statistic      p.value
1 (Intercept) 12.664145512 1.8379742715  6.890274 8.477089e-08
2        year -0.006239305 0.0009282243 -6.721764 1.366904e-07
library(dplyr)
bind_rows(tidy(model1), tidy(model2))
         term      estimate    std.error statistic      p.value
1 (Intercept) -11.063084650 1.4705189228 -7.523252 1.444892e-08
2        year   0.006009299 0.0007426499  8.091698 3.064797e-09
3 (Intercept)  12.664145512 1.8379742715  6.890274 8.477089e-08
4        year  -0.006239305 0.0009282243 -6.721764 1.366904e-07
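The combined table no longer says which rows came from which model. One possible way to keep track, sketched here with made-up data frames (rising and falling are hypothetical stand-ins, not the UN voting data), is to pass a named list to bind_rows() and use its .id argument to record each name in a new column:

```r
library(broom)
library(dplyr)

# Toy data (hypothetical values): one rising and one falling trend.
years <- seq(1947, 2013, by = 2)
wiggle <- rep(c(0.02, -0.02), length.out = length(years))
rising <- data.frame(year = years,
                     percent_yes = 0.30 + 0.006 * (years - 1947) + wiggle)
falling <- data.frame(year = years,
                      percent_yes = 0.80 - 0.006 * (years - 1947) + wiggle)

model_rising <- lm(percent_yes ~ year, data = rising)
model_falling <- lm(percent_yes ~ year, data = falling)

# The list names become the values of the "country" column.
combined <- bind_rows(
  list(rising = tidy(model_rising), falling = tidy(model_falling)),
  .id = "country"
)

# Keep only the slope terms, one row per model.
slopes <- dplyr::filter(combined, term == "year")
```

With the label stored as a column, the slopes can be filtered, sorted, or compared across models like any other data frame.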