Welcome to the course!

Unsupervised Learning in R

Hank Roark

Senior Data Scientist at Boeing

Chapter 1 overview

  • Unsupervised learning
  • Three major types of machine learning
  • Execute one type of unsupervised learning using R

Types of machine learning

  • Unsupervised learning
    • Finding structure in unlabeled data
  • Supervised learning
    • Making predictions based on labeled data
    • Predictions like regression or classification
  • Reinforcement learning

Labeled vs. unlabeled data

unlabeled data

1 Sample from Murphy, Machine Learning: A Probabilistic Perspective

Labeled vs. unlabeled data

labeled data

1 Sample from Murphy, Machine Learning: A Probabilistic Perspective
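
The difference between labeled and unlabeled data can be sketched in a few lines of R. This is a minimal illustration, assuming the built-in iris data set as a stand-in (it is not the sample shown on the slide); the names labeled and unlabeled are hypothetical.

```r
# Assumption: the built-in iris data stands in for the slide's example.
data(iris)

# Labeled data: every observation carries a known outcome (here, Species).
labeled <- iris
head(labeled, 3)

# Unlabeled data: only the measured features remain, with no outcome column.
unlabeled <- iris[, setdiff(names(iris), "Species")]
head(unlabeled, 3)
```

Supervised methods would use the Species column as the target; unsupervised methods work only with the columns kept in unlabeled.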

Unsupervised learning - clustering

  • Finding homogeneous subgroups within a larger group

People have features such as income, educational attainment, and gender

homogeneous group of people

Unsupervised learning - clustering

  • Finding homogeneous subgroups within a larger group

two groups of people mixed

Unsupervised learning - clustering

  • Finding homogeneous subgroups within a larger group

Clustering

two groups of people separated in two subgroups
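
One way to separate a mixed group into two subgroups is sketched below. It is a minimal example under stated assumptions: the data are simulated, the feature names income and education_years are hypothetical, and kmeans() from base R's stats package is used as one possible clustering algorithm among several.

```r
# Hypothetical data: two groups of 50 people, each described by two simulated features.
set.seed(42)
people <- rbind(
  cbind(income = rnorm(50, mean = 30, sd = 4),
        education_years = rnorm(50, mean = 11, sd = 1)),
  cbind(income = rnorm(50, mean = 80, sd = 4),
        education_years = rnorm(50, mean = 17, sd = 1))
)

# Standardize the features so that neither one dominates the distance calculation.
people_scaled <- scale(people)

# Ask k-means for two subgroups; nstart repeats the random start and keeps the best fit.
fit <- kmeans(people_scaled, centers = 2, nstart = 20)

# Cluster sizes and cluster centers (in standardized units).
table(fit$cluster)
fit$centers
```

No group labels are passed to kmeans(); the subgroups are recovered from the features alone, which is what makes this unsupervised.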

Clustering examples

market segmentation notebook

Clustering examples

movies films and popcorn

Unsupervised learning - dimensionality reduction

  • Finding homogeneous subgroups within a larger group
    • Clustering
  • Finding patterns in the features of the data
    • Dimensionality reduction

Unsupervised learning - dimensionality reduction

  • Find patterns in the features of the data
  • Visualization of high-dimensional data
  • Pre-processing before supervised learning
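
The uses listed above can be sketched with principal component analysis, one common dimensionality reduction technique. This is a minimal example under stated assumptions: prcomp() from base R's stats package is one possible choice, and the four numeric iris measurements are a stand-in for higher-dimensional data.

```r
# Assumption: the four numeric iris columns stand in for high-dimensional features.
features <- iris[, 1:4]

# Center and scale each feature, then compute the principal components.
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Proportion of variance captured by each component.
summary(pca)

# Keep the first two components: a 2-D view suitable for plotting,
# or a compact input for a later supervised model.
reduced <- pca$x[, 1:2]
head(reduced, 3)
```

The matrix reduced has one row per observation and two columns, so it can be passed to plot() for visualization or used in place of the original features as a pre-processing step.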

Challenges and benefits

  • No single goal of analysis
  • Requires more creativity
  • Much more unlabeled data available than cleanly labeled data

Let's practice!
