Exploring numerical data

Exploratory Data Analysis in R

Andrew Bray

Assistant Professor, Reed College

Cars dataset

str(cars)
'data.frame':    428 obs. of  19 variables:
 $ name       : chr  "Chevrolet Aveo 4dr" "Chevrolet Aveo LS 4dr hatch" "Chevrolet Cavalier 2dr" ...
 $ sports_car : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ suv        : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ wagon      : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ minivan    : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ pickup     : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ all_wheel  : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ rear_wheel : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ msrp       : int  11690 12585 14610 14810 16385 13670 15040 13270 13730 15460 ...
 $ dealer_cost: int  10965 11802 13697 13884 15357 12849 14086 12482 12906 14496 ...
 $ eng_size   : num  1.6 1.6 2.2 2.2 2.2 2 2 2 2 2 ...
 $ ncyl       : int  4 4 4 4 4 4 4 4 4 4 ...
 $ horsepwr   : int  103 103 140 140 140 132 132 130 110 130 ...
 $ city_mpg   : int  28 28 26 26 26 29 29 26 27 26 ...
 $ hwy_mpg    : int  34 34 37 37 37 36 36 33 36 33 ...
 $ weight     : int  2370 2348 2617 2676 2617 2581 2626 2612 2606 2606 ...
 $ wheel_base : int  98 98 104 104 104 105 105 103 103 103 ...
 $ length     : int  167 153 183 183 183 174 174 168 168 168 ...
 $ width      : int  66 66 69 68 69 67 67 67 67 67 ...

Dotplot

# One dot per car, stacked along the weight axis
ggplot(cars, aes(x = weight)) +
  geom_dotplot(dotsize = 0.4)

ch2_1.005.png

Histogram

# Counts of cars falling into equal-width weight bins
ggplot(cars, aes(x = weight)) +
  geom_histogram()

ch2_1.007.png
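
When no bin setting is given, `geom_histogram()` falls back to 30 bins, which is rarely the best choice. The following is a minimal sketch of setting the bin width explicitly, assuming ggplot2 is installed. `weights_demo` is a hypothetical stand-in built from the ten weights shown in the `str(cars)` output, and the bin width of 100 is an arbitrary value chosen for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the course data: the ten weights shown by str(cars)
weights_demo <- data.frame(
  weight = c(2370, 2348, 2617, 2676, 2617, 2581, 2626, 2612, 2606, 2606)
)

# binwidth is expressed in the units of the x variable; 100 is illustrative only
p_hist <- ggplot(weights_demo, aes(x = weight)) +
  geom_histogram(binwidth = 100)

print(p_hist)
```

Trying a few bin widths is usually worthwhile, since narrow bins show more detail and wide bins show the overall shape.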

Density plot

# Smoothed estimate of the shape of the weight distribution
ggplot(cars, aes(x = weight)) +
  geom_density()

ch2_1.009.png

ch2_1.010.png

ch2_1.012.png
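
How smooth a density plot looks depends on the bandwidth of the smoothing kernel. The sketch below compares two bandwidths through the `bw` argument of `geom_density()`, assuming ggplot2 is installed. `weights_demo` is a hypothetical stand-in built from the ten weights shown in the `str(cars)` output, and the values 20 and 100 are arbitrary choices for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the course data: the ten weights shown by str(cars)
weights_demo <- data.frame(
  weight = c(2370, 2348, 2617, 2676, 2617, 2581, 2626, 2612, 2606, 2606)
)

# A small bandwidth follows the individual observations closely
p_narrow <- ggplot(weights_demo, aes(x = weight)) +
  geom_density(bw = 20)

# A large bandwidth smooths more heavily and hides small bumps
p_wide <- ggplot(weights_demo, aes(x = weight)) +
  geom_density(bw = 100)

print(p_narrow)
print(p_wide)
```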

Boxplot

# x = 1 is a dummy position for the single box; coord_flip() lays it out horizontally
ggplot(cars, aes(x = 1, y = weight)) +
  geom_boxplot() +
  coord_flip()

ch2_1.015.png

ch2_1.016.png

ch2_1.017.png

ch2_1.018.png
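
The pieces of a boxplot can be computed by hand, which may help when reading one. The sketch below uses base R only and follows the usual convention, which `geom_boxplot()` also documents: the box spans the first to third quartile, the line inside it is the median, and points more than 1.5 times the interquartile range beyond the box are drawn as outliers. The `weight` vector is a hypothetical stand-in holding only the ten weights shown in the `str(cars)` output, so its results say nothing about the full dataset.

```r
# Hypothetical stand-in for the course data: the ten weights shown by str(cars)
weight <- c(2370, 2348, 2617, 2676, 2617, 2581, 2626, 2612, 2606, 2606)

# Box edges (first and third quartiles) and the median line
q <- quantile(weight, probs = c(0.25, 0.5, 0.75))
iqr <- q[["75%"]] - q[["25%"]]

# Fences: points beyond these are flagged as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr
outliers <- weight[weight < lower_fence | weight > upper_fence]

print(q)
print(outliers)
```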

Faceted histogram

ggplot(cars, aes(x = hwy_mpg)) +
  geom_histogram() +
  facet_wrap(~pickup)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning message:
Removed 14 rows containing non-finite values (stat_bin).

ch2_1.022.png

ch2_1.023.png

ch2_1.024.png
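
The message and warning printed with the faceted histogram can both be addressed directly. The sketch below sets `binwidth` explicitly and drops rows with a missing `hwy_mpg` before plotting, assuming ggplot2 is installed. `cars_demo` is a hypothetical toy data frame with made-up values, not the course data, and the bin width of 2 is an arbitrary choice.

```r
library(ggplot2)

# Made-up toy values (not the course data): six cars, four pickups, one missing mpg
cars_demo <- data.frame(
  hwy_mpg = c(34, 34, 37, 36, 33, NA, 21, 19, 23, 20),
  pickup  = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Drop rows with a missing hwy_mpg up front so the plot has nothing to discard
cars_complete <- cars_demo[!is.na(cars_demo$hwy_mpg), ]

# One panel per value of pickup, with an explicit bin width
p_facet <- ggplot(cars_complete, aes(x = hwy_mpg)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~pickup)

print(p_facet)
```

Filtering first makes the number of dropped rows explicit rather than leaving it to a warning.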

Let's practice!
