Introducing the data

Exploratory Data Analysis in R

Andrew Bray

Assistant Professor, Reed College

Email dataset

email
# A tibble: 3,921 × 21
       spam to_multiple  from    cc sent_email                time image
     <fct>       <dbl> <dbl> <int>      <dbl>              <dttm> <dbl>
1  not-spam           0     1     0          0 2012-01-01 01:16:41     0
2  not-spam           0     1     0          0 2012-01-01 02:03:59     0
3  not-spam           0     1     0          0 2012-01-01 11:00:32     0
4  not-spam           0     1     0          0 2012-01-01 04:09:49     0
5  not-spam           0     1     0          0 2012-01-01 05:00:01     0
6  not-spam           0     1     0          0 2012-01-01 05:04:46     0
7  not-spam           1     1     0          1 2012-01-01 12:55:06     0
8  not-spam           1     1     1          1 2012-01-01 13:45:21     1
9  not-spam           0     1     0          0 2012-01-01 16:08:59     0
10 not-spam           0     1     0          0 2012-01-01 13:12:00     0
# ... with 3,911 more rows, and 14 more variables: attach <dbl>,
#   dollar <dbl>, winner <fct>, inherit <dbl>, viagra <dbl>,
#   password <dbl>, num_char <dbl>, line_breaks <int>, format <dbl>,
#   re_subj <dbl>, exclaim_subj <dbl>, urgent_subj <dbl>,
#   exclaim_mess <dbl>, number <fct>
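
The slide prints `email` without showing where it is loaded from. A minimal sketch, assuming the data is the `email` dataset shipped with the openintro package (an assumption; the slides do not name a source). `label_spam()` is a hypothetical helper, added only because the packaged `spam` column may be coded 0/1 rather than labeled as on the slide.

```r
# Assumption: `email` is the dataset distributed with the openintro package.
data(email, package = "openintro")

# Hypothetical helper: map 0/1 (or already-labeled) values to the
# "not-spam" / "spam" labels shown in the slide output.
label_spam <- function(x) {
  x <- as.character(x)
  factor(ifelse(x %in% c("1", "spam"), "spam", "not-spam"),
         levels = c("not-spam", "spam"))
}

# Work on a copy so the original object stays untouched.
email_labeled <- email
email_labeled$spam <- label_spam(email$spam)

dim(email_labeled)         # number of rows and columns
table(email_labeled$spam)  # count of emails in each class
```

Checking `dim()` against the tibble header (3,921 × 21) is a quick way to confirm the same data is in use.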

Histograms

ggplot(data, aes(x = var1)) +
  geom_histogram()

Histograms

ggplot(data, aes(x = var1)) +
  geom_histogram() +
  facet_wrap(~var2)
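
As a concrete instance of the faceted histogram template, a minimal sketch might substitute `num_char` for `var1` and `spam` for `var2`. Both substitutions are assumptions (the slides keep the variables generic), as is loading `email` from the openintro package, where the facet labels may appear as 0/1.

```r
library(ggplot2)

# Assumption: `email` is the dataset distributed with the openintro package.
data(email, package = "openintro")

# Assumed substitutions: var1 -> num_char, var2 -> spam.
p_hist <- ggplot(email, aes(x = num_char)) +
  geom_histogram(bins = 30) +  # same as the default, but stated so no message is printed
  facet_wrap(~spam)            # one panel per level of spam

print(p_hist)
```

Faceting draws each group in its own panel on shared axes, so the shapes of the two distributions can be compared directly.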

Boxplots

ggplot(data, aes(x = var2, y = var1)) +
  geom_boxplot()

Boxplots

ggplot(data, aes(x = 1, y = var1)) +
  geom_boxplot()
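
The two boxplot templates can be sketched side by side. As before, `num_char` and `spam` stand in for `var1` and `var2`, and `email` is assumed to come from the openintro package; none of these choices are stated on the slides.

```r
library(ggplot2)

# Assumption: `email` is the dataset distributed with the openintro package.
data(email, package = "openintro")

# Grouped boxplots: one box per level of spam.
# factor() keeps x discrete even if spam is stored as numeric 0/1.
p_by_group <- ggplot(email, aes(x = factor(spam), y = num_char)) +
  geom_boxplot()

# Single boxplot: x = 1 is a constant placeholder, so every row
# falls into the same group and one box is drawn.
p_single <- ggplot(email, aes(x = 1, y = num_char)) +
  geom_boxplot()

print(p_by_group)
print(p_single)
```

The constant `x = 1` is used only because a vertical boxplot needs something mapped to x; its value carries no meaning.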

Density plots

ggplot(data, aes(x = var1)) +
  geom_density()

Density plots

ggplot(data, aes(x = var1, fill = var2)) +
  geom_density(alpha = .3)
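
A minimal sketch of the overlaid density template, again assuming `num_char` for `var1`, `spam` for `var2`, and the openintro copy of `email` (all assumptions, not taken from the slides).

```r
library(ggplot2)

# Assumption: `email` is the dataset distributed with the openintro package.
data(email, package = "openintro")

# fill = factor(spam) draws one density curve per level of spam;
# alpha = .3 makes the fills translucent so overlapping regions stay visible.
p_dens <- ggplot(email, aes(x = num_char, fill = factor(spam))) +
  geom_density(alpha = .3)

print(p_dens)
```

Mapping the grouping variable to `fill` overlays the curves in one panel, which is an alternative to the separate panels that `facet_wrap()` produces.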

Let's practice!
