Conclusion

Exploratory Data Analysis in R

Andrew Bray

Assistant Professor, Reed College

Pie chart vs. bar chart



Faceting vs. stacking
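
The two approaches can be compared with a minimal sketch. It assumes the `mpg` data set that ships with ggplot2 (not the data used in the course) and an arbitrary bin width of 2; the object names are illustrative only.

```r
library(ggplot2)

# Faceting: one panel per drive type, all panels share the same axes
p_facet <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~ drv)

# Stacking: a single panel, bars for each drive type stacked on top of
# each other (geom_histogram() stacks by default when fill is mapped)
p_stack <- ggplot(mpg, aes(x = hwy, fill = drv)) +
  geom_histogram(binwidth = 2)
```

Printing `p_facet` or `p_stack` draws the plot. Faceting tends to make each group's shape easier to read, while stacking makes the overall distribution easier to see.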



Histogram

ggplot(data, aes(x = var1)) +
    geom_histogram()



Density plot

cars %>%
  filter(eng_size < 2.0) %>%
  ggplot(aes(x = hwy_mpg)) +
  geom_density()



Side-by-side box plots

ggplot(common_cyl, aes(x = as.factor(ncyl), y = city_mpg)) +
  geom_boxplot()
Warning message:
Removed 11 rows containing non-finite values (stat_boxplot).



Center: mean, median, mode

x
76 78 75 74 76 72 74 73 73 75 74
table(x)
x
72 73 74 75 76 78 
 1  2  3  2  2  1
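
The three measures of center for the vector `x` shown above can be computed in base R. Base R has no built-in function for the statistical mode, so the sketch below derives it from the frequency table; the name `mode_x` is illustrative only.

```r
# The eleven values shown on the slide
x <- c(76, 78, 75, 74, 76, 72, 74, 73, 73, 75, 74)

mean_x   <- mean(x)    # arithmetic mean: sum of values / number of values
median_x <- median(x)  # middle value of the sorted data

# Mode: the most frequent value, read off the frequency table
counts <- table(x)
mode_x <- as.numeric(names(counts)[which.max(counts)])
```

Here the median and the mode are both 74, and the mean is 820 / 11, about 74.5. Note that `which.max()` returns only the first maximum, so tied modes would need separate handling.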



Distribution of income

ggplot(life, aes(x = income, fill = west_coast)) +
  geom_density(alpha = .3)
ggplot(life, aes(x = log(income), fill = west_coast)) +
  geom_density(alpha = .3)



With group_by()

life %>%
  slice(240:247) %>%
  group_by(west_coast) %>%
  summarize(mean(expectancy))
# A tibble: 2 x 2
  west_coast mean(expectancy)
       <lgl>            <dbl>
1      FALSE         79.26125
2       TRUE         79.29375



Spam and exclamation points

email %>%
  mutate(zero = exclaim_mess == 0) %>%
  ggplot(aes(x = zero, fill = spam)) +
  geom_bar()



Spam and images

email %>%
  mutate(has_image = image > 0) %>%
  ggplot(aes(x = as.factor(has_image), fill = spam)) +
  geom_bar(position = "fill")
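
With `position = "fill"`, each bar is rescaled to height 1, so the plot shows the proportion of spam within each group rather than raw counts. The sketch below computes such proportions numerically in base R. The data frame `toy` is a hypothetical stand-in with made-up values, not the `email` data set.

```r
# Hypothetical toy data standing in for the email data (values are made up)
toy <- data.frame(
  has_image = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  spam      = c("spam", "not-spam", "spam", "not-spam", "not-spam", "not-spam")
)

# Counts of spam status within each has_image group
counts <- table(toy$has_image, toy$spam)

# Row-wise proportions: each row sums to 1, like a bar under position = "fill"
props <- prop.table(counts, margin = 1)
```

Each row of `props` corresponds to one bar in the filled bar chart, and each column to one fill color.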



Let's practice!
