Conclusion

Exploratory Data Analysis in R

Andrew Bray

Assistant Professor, Reed College

Pie chart vs. bar chart

ch4_4.002.png

Faceting vs. stacking

ch4_4.003.png

Histogram

ggplot(data, aes(x = var1)) +
    geom_histogram()

ch4_4.004.png

Density plot

cars %>%
  filter(eng_size < 2.0) %>%
  ggplot(aes(x = hwy_mpg)) +
  geom_density()

ch4_4.005.png

Side-by-side box plots

ggplot(common_cyl, aes(x = as.factor(ncyl), y = city_mpg)) +
  geom_boxplot()
Warning message:
Removed 11 rows containing non-finite values (stat_boxplot).

ch4_4.006.png

Center: mean, median, mode

x
76 78 75 74 76 72 74 73 73 75 74
table(x)
x
72 73 74 75 76 78 
 1  2  3  2  2  1
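
The three measures of center named on this slide can be computed for the `x` shown above. This is a minimal sketch: base R has no built-in statistical mode (`mode()` returns the storage mode of an object), so the most frequent value is read off `table(x)`; the name `x_mode` is introduced here for illustration.

```r
# Values copied from the slide above
x <- c(76, 78, 75, 74, 76, 72, 74, 73, 73, 75, 74)

mean(x)    # arithmetic mean
median(x)  # middle value of the sorted data

# Most frequent value, taken from the frequency table
counts <- table(x)
x_mode <- as.numeric(names(counts)[which.max(counts)])
x_mode
```

If several values tie for the highest count, `which.max()` returns only the first of them.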

ch4_4.007.png

Shape of income

ggplot(life, aes(x = income, fill = west_coast)) +
  geom_density(alpha = .3)
ggplot(life, aes(x = log(income), fill = west_coast)) +
  geom_density(alpha = .3)
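
Why the second plot uses `log(income)` can be sketched with a few made-up numbers. The vector `income_toy` below is hypothetical (it is not taken from the `life` data) and is deliberately constructed to be symmetric on the log scale, so real data will not behave this cleanly.

```r
# Hypothetical incomes, evenly spaced on the log scale
income_toy <- exp(c(9, 9.5, 10, 10.5, 11))

# Raw scale: the long right tail pulls the mean above the median
mean(income_toy) > median(income_toy)

# Log scale: the values are symmetric, so mean and median agree
# (up to floating-point rounding)
abs(mean(log(income_toy)) - median(log(income_toy))) < 1e-9
```

A right-skewed variable often looks closer to symmetric after a log transform, which makes the shapes of the two groups easier to compare.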

ch4_4.008.png

With group_by()

life %>%
  slice(240:247) %>%
  group_by(west_coast) %>%
  summarize(mean(expectancy))
# A tibble: 2 x 2
  west_coast mean(expectancy)
       <lgl>            <dbl>
1      FALSE         79.26125
2       TRUE         79.29375
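
The same grouped summary can be tried on a small data frame. This is a sketch under stated assumptions: `life_toy` and its values are made up as a stand-in for `life`, and the summary columns are given explicit names (`mean_expectancy`, `n`) rather than the default `mean(expectancy)` header.

```r
library(dplyr)

# Hypothetical stand-in for the `life` data; values are made up
life_toy <- tibble(
  west_coast = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  expectancy = c(80, 82, 78, 79, 80)
)

# One row per group, with a named mean and the group size
by_coast <- life_toy %>%
  group_by(west_coast) %>%
  summarize(mean_expectancy = mean(expectancy), n = n())
by_coast
```

Reporting `n` next to the mean shows how many rows each group average is based on.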

ch4_4.009.png

Spam and exclamation points

email %>%
  mutate(zero = exclaim_mess == 0) %>%
  ggplot(aes(x = zero, fill = spam)) +
  geom_bar()

ch4_4.010.png

Spam and images

email %>%
  mutate(has_image = image > 0) %>%
  ggplot(aes(x = as.factor(has_image), fill = spam)) +
  geom_bar(position = "fill")
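
`position = "fill"` rescales every bar to height 1, so the plot shows the share of each `spam` level within each bar. The numbers behind such a plot can be computed directly; in this sketch `email_toy` and its values are made up as a stand-in for `email`.

```r
library(dplyr)

# Hypothetical stand-in for the `email` data; values are made up
email_toy <- tibble(
  image = c(0, 0, 0, 1, 2, 0),
  spam  = c("spam", "not-spam", "not-spam", "not-spam", "not-spam", "spam")
)

# Count each (has_image, spam) combination, then convert the counts
# to proportions within each has_image group
props <- email_toy %>%
  mutate(has_image = image > 0) %>%
  count(has_image, spam) %>%
  group_by(has_image) %>%
  mutate(prop = n / sum(n))
props
```

Within each `has_image` group the `prop` values sum to 1, which is what the filled bars display.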

ch4_4.011.png

Let's practice!
