Generating a radially separable dataset

Support Vector Machines in R

Kailash Awati

Instructor

Generating a 2d uniformly distributed set of points

  • Generate a dataset with 200 points
    • 2 predictors x1 and x2, uniformly distributed between -1 and 1.
# Set required number of datapoints
n <- 200
# Set seed to ensure reproducibility
set.seed(42)

# Generate dataframe with 2 predictors x1 and x2 in (-1, 1)
df <- data.frame(x1 = runif(n, min = -1, max = 1),
                 x2 = runif(n, min = -1, max = 1))
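A quick sanity check can confirm that the generated data has the intended shape and range. The sketch below is an illustrative addition rather than part of the original slides; it rebuilds the same dataset and inspects it using base R only.

```r
# Recreate the dataset exactly as on the slide
n <- 200
set.seed(42)
df <- data.frame(x1 = runif(n, min = -1, max = 1),
                 x2 = runif(n, min = -1, max = 1))

# Dimensions: one row per point, one column per predictor
dim(df)

# Both predictors should fall within [-1, 1]
range(df$x1)
range(df$x2)
```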

Create a circular boundary

  • Create a circular decision boundary of radius 0.7 units.
  • Categorical variable y is +1 if the point lies outside the boundary and -1 if it lies within it.
radius <- 0.7
radius_squared <- radius ^ 2

# Categorize data points by their location relative to the boundary
df$y <- factor(ifelse(df$x1 ^ 2 + df$x2 ^ 2 < radius_squared, -1, 1),
               levels = c(-1, 1))
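Because the points are uniform on a square of side 2, the share expected to fall inside the circle is the ratio of the two areas, pi * radius^2 / 4, or roughly 0.385 for a radius of 0.7. The following sketch, an illustrative addition that is not in the original slides, compares that expectation with the observed class proportions. The names expected_inside and observed_inside are introduced here for illustration only.

```r
# Recreate the dataset and class labels as on the slides
n <- 200
set.seed(42)
df <- data.frame(x1 = runif(n, min = -1, max = 1),
                 x2 = runif(n, min = -1, max = 1))
radius <- 0.7
df$y <- factor(ifelse(df$x1 ^ 2 + df$x2 ^ 2 < radius ^ 2, -1, 1),
               levels = c(-1, 1))

# Observed class counts
table(df$y)

# Expected share inside the circle: circle area / square area
expected_inside <- pi * radius ^ 2 / 4
# Observed share of points labeled -1 (inside the boundary)
observed_inside <- mean(df$y == "-1")

c(expected = expected_inside, observed = observed_inside)
```

With only 200 points, the observed share will usually differ somewhat from the expected value.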

Plot the dataset

  • Visualize using ggplot.
  • Predictors plotted on 2 axes; classes distinguished by color.
library(ggplot2)
# Build plot
p <- ggplot(data = df, aes(x = x1, y = x2, color = y)) + 
     geom_point() + 
     scale_color_manual(values = c("-1" = "red", "1" = "blue")) 
# Display plot 
p

Chapter 3.1 - radially separable dataset


Adding a circular boundary - Part 1

  • We'll create a function to generate a circle
# Function generates dataframe with points
# lying on a circle of radius r
circle <- 
  function(x1_center, x2_center, r, npoint = 100) {

  # npoint angles from 0 to 2*pi inclusive (spacing 2*pi/(npoint - 1));
  # the first and last points coincide, which closes the circle
  theta <- seq(0, 2 * pi, length.out = npoint)
  x1_circ <- x1_center + r * cos(theta)
  x2_circ <- x2_center + r * sin(theta)

  data.frame(x1c = x1_circ, x2c = x2_circ)
}
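One way to check the function is to confirm that every generated point lies at distance r from the center. The sketch below is an illustrative addition, not from the original slides. It repeats the circle() definition so that it runs on its own, and the names pts and dist_from_center are introduced here for illustration.

```r
# Same circle() function as on the slide
circle <- function(x1_center, x2_center, r, npoint = 100) {
  theta <- seq(0, 2 * pi, length.out = npoint)
  data.frame(x1c = x1_center + r * cos(theta),
             x2c = x2_center + r * sin(theta))
}

# Generate points on a circle of radius 0.7 centered at the origin
pts <- circle(x1_center = 0, x2_center = 0, r = 0.7)

# Distance of each point from the center; all should equal r
dist_from_center <- sqrt(pts$x1c ^ 2 + pts$x2c ^ 2)
isTRUE(all.equal(dist_from_center, rep(0.7, nrow(pts))))

# First and last points coincide (up to floating-point error),
# so a path drawn through the points forms a closed curve
isTRUE(all.equal(unlist(pts[1, ]), unlist(pts[nrow(pts), ])))
```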

Adding a circular boundary - Part 2

  • To add boundary to plot:
    • generate boundary using circle() function.
    • add boundary to plot using geom_path()
# Generate boundary
boundary <- circle(x1_center = 0,
                   x2_center = 0,
                   r = radius)
# Add boundary to previous plot
p <- p + 
     geom_path(data = boundary,
               aes(x = x1c, y = x2c),
               inherit.aes = FALSE)
# Display plot
p

Chapter 3.1 - radially separable dataset with decision boundary


Time to practice!
