Data preparation for kNN

Supervised Learning in R: Classification

Brett Lantz

Instructor

kNN assumes numeric data

Dummy Coding Example
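
One way to picture dummy coding is as a set of 0/1 indicator columns, one per category. The sketch below is illustrative only: the `signs_demo` data frame and its `shape` values are hypothetical and are not taken from the slides.

```r
# Hypothetical data: a categorical 'shape' feature (assumed for illustration)
signs_demo <- data.frame(
  shape = c("rectangle", "diamond", "octagon", "diamond"),
  stringsAsFactors = FALSE
)

# Create one 0/1 indicator column per category so that
# distance calculations can work on numbers instead of labels
signs_demo$rectangle <- ifelse(signs_demo$shape == "rectangle", 1, 0)
signs_demo$diamond   <- ifelse(signs_demo$shape == "diamond", 1, 0)
signs_demo$octagon   <- ifelse(signs_demo$shape == "octagon", 1, 0)

signs_demo
```

Each row then has exactly one indicator set to 1, so two observations with different shapes differ in those columns by a fixed, numeric amount.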

kNN benefits from normalized data

Before Normalization

After Normalization
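
A small numeric sketch of why scale matters: when one feature ranges from 0 to 1 and another from 0 to 255, the larger-range feature can dominate the Euclidean distance. The three observations and the `euclid()` and `rescale_r1()` helpers below are hypothetical, chosen only to illustrate the effect.

```r
# Hypothetical observations: 'shape' is a 0/1 indicator, 'r1' is a 0-255 color value
obs_a <- c(shape = 1, r1 = 200)
obs_b <- c(shape = 0, r1 = 200)  # different shape, same color
obs_c <- c(shape = 1, r1 = 210)  # same shape, slightly different color

# Euclidean distance between two numeric vectors
euclid <- function(p, q) sqrt(sum((p - q)^2))

# Before rescaling: the small color difference outweighs the shape difference
euclid(obs_a, obs_b)
euclid(obs_a, obs_c)

# Rescale r1 to the 0-1 range (assumes 0 and 255 are its min and max)
rescale_r1 <- function(p) {
  p["r1"] <- p["r1"] / 255
  p
}

# After rescaling: the shape difference is now the larger distance
euclid(rescale_r1(obs_a), rescale_r1(obs_b))
euclid(rescale_r1(obs_a), rescale_r1(obs_c))
```

Before rescaling, the observation with the different shape looks closer than the one with a slightly different color; after rescaling, that ordering reverses.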

Normalizing data in R

# define a min-max normalize() function
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
# normalized version of r1
summary(normalize(signs$r1))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.1935  0.3528  0.4046  0.6129  1.0000
# un-normalized version of r1
summary(signs$r1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    3.0    51.0    90.5   103.3   155.0   251.0
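
The same function can be applied to every numeric column at once. The following is a minimal sketch, assuming a small hypothetical data frame `demo` standing in for the numeric columns of `signs`; the column `g1` and all values are made up for illustration.

```r
# Min-max normalize() as defined above, repeated so this example runs on its own
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical stand-in for the numeric feature columns
demo <- data.frame(
  r1 = c(3, 51, 155, 251),
  g1 = c(10, 20, 30, 40)
)

# lapply() applies normalize() to each column; as.data.frame() rebuilds the table
demo_n <- as.data.frame(lapply(demo, normalize))

# Every column should now span 0 to 1
sapply(demo_n, range)
```

Note that a column whose values are all identical would produce a division by zero here (`NaN` results), so such columns may need to be dropped or handled separately.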

Let's practice!
