HR Analytics: Exploring Employee Data in R
Ben Teusch
HR Analytics Consultant
head(survey)
# A tibble: 6 x 5
  employee_id department  engagement    salary vacation_days_taken
        <int> <chr>            <int>     <dbl>               <int>
1           1 Sales                3 103263.64                   7
2           2 Engineering          2  80708.64                  12
3           4 Engineering          4  60737.05                  12
4           5 Engineering          3  99116.32                   7
5           7 Engineering          3  51021.64                  18
6           8 Engineering          5  98399.87                   9
survey %>%
  mutate(max_salary = max(salary))
# A tibble: 1,470 x 6
  employee_id department  engagement    salary vacation_days_taken max_salary
        <int> <chr>            <int>     <dbl>               <int>      <dbl>
1           1 Sales                3 103263.64                   7   164072.6
2           2 Engineering          2  80708.64                  12   164072.6
3           4 Engineering          4  60737.05                  12   164072.6
4           5 Engineering          3  99116.32                   7   164072.6
5           7 Engineering          3  51021.64                  18   164072.6
# ... with 1,465 more rows
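The output above shows the same max_salary on every row. A minimal sketch of why, assuming the dplyr package is installed; toy_survey and its values are hypothetical stand-ins for the survey data, not taken from it.

```r
library(dplyr)

# Hypothetical toy data; only the salary column matters here.
toy_survey <- data.frame(
  employee_id = c(1, 2, 4),
  salary = c(50000, 70000, 60000)
)

# max(salary) returns a single number, which mutate() repeats on every row.
with_max <- toy_survey %>%
  mutate(max_salary = max(salary))

print(with_max)
```

Because mutate() keeps one row per employee, an aggregate such as max() is recycled across all rows rather than collapsing the table.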
x <- 5
if (x < 10) { "True" } else { "False" }
[1] "True"
z <- c(5, 8, 11, 14)
if (z < 10) { "True" } else { "False" }
[1] "True"
Warning message:
In if (z < 10) { :
  the condition has length > 1 and only the first element will be used
# In R 4.2.0 and later, a condition of length > 1 is an error rather than a warning.
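When the goal is a single decision about the whole vector, one option is to collapse the condition to one TRUE/FALSE before passing it to if. A minimal sketch in base R; the use of any() and all() is an illustration and does not appear in the original slides.

```r
z <- c(5, 8, 11, 14)

# if() expects a single TRUE/FALSE, so collapse the vector condition first.
any_below <- if (any(z < 10)) "True" else "False"  # at least one element is < 10
all_below <- if (all(z < 10)) "True" else "False"  # every element is < 10

print(any_below)
print(all_below)
```

Here any_below is "True" because 5 and 8 are below 10, while all_below is "False" because 11 and 14 are not.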
ifelse(z < 10, "Yes", "No")
[1] "Yes" "Yes" "No"  "No"
survey %>%
  mutate(takes_vacation = ifelse(vacation_days_taken > 10, "Yes", "No"))
# A tibble: 1,470 x 6
  employee_id department  engagement    salary vacation_days_taken takes_vacation
        <int> <chr>            <int>     <dbl>               <int> <chr>
1           1 Sales                3 103263.64                   7 No
2           2 Engineering          2  80708.64                  12 Yes
3           4 Engineering          4  60737.05                  12 Yes
4           5 Engineering          3  99116.32                   7 No
5           7 Engineering          3  51021.64                  18 Yes
# ... with 1,465 more rows
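The same pattern can be tried on a small table where the result is easy to check by eye. A minimal sketch, assuming dplyr is installed; toy_survey is a hypothetical three-row stand-in that reuses the threshold of 10 days from the slide.

```r
library(dplyr)

# Hypothetical toy data standing in for the survey table.
toy_survey <- data.frame(
  employee_id = c(1, 2, 7),
  vacation_days_taken = c(7, 12, 18)
)

# ifelse() is vectorized, so it labels each row separately inside mutate().
flagged <- toy_survey %>%
  mutate(takes_vacation = ifelse(vacation_days_taken > 10, "Yes", "No"))

print(flagged)
```

Since ifelse() returns one value per input element, it fits mutate(), which needs a full column; a plain if would only look at a single condition.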
survey %>%
  group_by(department) %>%
  summarize(max_salary = max(salary))
# A tibble: 3 x 2
  department  max_salary
  <chr>            <dbl>
1 Engineering   164072.6
2 Finance       127013.2
3 Sales         143105.5
survey %>%
  group_by(department) %>%
  summarize(max_salary = max(salary),
            min_salary = min(salary),
            avg_salary = mean(salary))
# A tibble: 3 x 4
  department  max_salary min_salary avg_salary
  <chr>            <dbl>      <dbl>      <dbl>
1 Engineering   164072.6   45529.69   73576.35
2 Finance       127013.2   45714.07   76651.66
3 Sales         143105.5   46133.67   75073.57
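The grouped summary can be checked by hand on a tiny table. A minimal sketch, assuming dplyr is installed; toy_survey and its salaries are hypothetical values chosen so the per-department results are obvious.

```r
library(dplyr)

# Hypothetical toy data: two Sales employees and one Engineering employee.
toy_survey <- data.frame(
  department = c("Sales", "Sales", "Engineering"),
  salary = c(100, 200, 300)
)

# group_by() splits the rows by department; summarize() then returns
# one row per department with the requested statistics.
by_department <- toy_survey %>%
  group_by(department) %>%
  summarize(max_salary = max(salary),
            min_salary = min(salary),
            avg_salary = mean(salary))

print(by_department)
```

Unlike mutate(), summarize() collapses each group to a single row, so the result has one row per department rather than one per employee.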