Exploratory Data Analysis in R
Andrew Bray
Assistant Professor, Reed College
comics
# A tibble: 23,272 x 11
   name                                   id               align
   <fct>                                  <fct>            <fct>
 1 Spider-Man (Peter Parker)              Secret Identity  Good
 2 Captain America (Steven Rogers)        Public Identity  Good
 3 Wolverine (James \\"Logan\\" Howlett)  Public Identity  Neutral
 4 Iron Man (Anthony \\"Tony\\" Stark)    Public Identity  Good
 5 Thor (Thor Odinson)                    No Dual Identity Good
 6 Benjamin Grimm (Earth-616)             Public Identity  Good
 7 Reed Richards (Earth-616)              Public Identity  Good
 8 Hulk (Robert Bruce Banner)             Public Identity  Good
 9 Scott Summers (Earth-616)              Public Identity  Neutral
10 Jonathan Storm (Earth-616)             Public Identity  Good
# ... with 23,262 more rows, and 8 more variables: eye <fct>,
#   hair <fct>, gender <fct>, gsm <fct>, alive <fct>,
#   appearances <int>, first_appear <fct>, publisher <fct>
levels(comics$align)
[1] "Bad"                "Good"               "Neutral"
[4] "Reformed Criminals"
levels(comics$id)  # Note: NAs are ignored by levels()
[1] "No Dual" "Public"  "Secret"  "Unknown"
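Because levels() does not report missing values, it can help to check for them separately. The following is a minimal sketch on a toy factor; `id_toy` and its values are hypothetical and are not taken from the comics data.

```r
# Toy factor (hypothetical values, not the comics data) with one missing entry
id_toy <- factor(c("Secret", "Public", NA, "Public"))

levels(id_toy)                  # NA is not reported as a level
table(id_toy)                   # NA is dropped from the counts by default
table(id_toy, useNA = "ifany")  # adds an <NA> entry when NAs are present
sum(is.na(id_toy))              # direct count of missing values
```

The same calls should work on comics$id, assuming the dataset is loaded as shown on the slide.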
table(comics$id, comics$align)
           Bad Good Neutral Reformed Criminals
  No Dual  474  647     390                  0
  Public  2172 2930     965                  1
  Secret  4493 2475     959                  1
  Unknown    7    0       2                  0
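To see how table() arranges its two arguments, here is a small sketch on a toy data frame. The name `toy` and its five rows are made up for illustration and are not part of the comics data.

```r
# Toy data (hypothetical, not the comics data): two categorical variables
toy <- data.frame(
  id    = factor(c("Secret", "Public", "Public", "Secret", "Public")),
  align = factor(c("Bad", "Good", "Good", "Good", "Bad"))
)

# The first argument supplies the rows, the second the columns
tab <- table(toy$id, toy$align)
tab

tab["Public", "Good"]  # look up a single cell by its level names
addmargins(tab)        # append row and column totals
```

Each cell counts the rows of the data that share that combination of levels, which is how the comics table above should be read as well.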
library(ggplot2)  # Load package
ggplot(comics, aes(x = id, fill = align)) +
  geom_bar()