ALS parameters and hyperparameters

Building Recommendation Engines with PySpark

Jamen Long

Data Scientist at Nike

Example ALS model code

from pyspark.ml.recommendation import ALS

# alpha only takes effect when implicitPrefs=True
als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating",
                rank=25, maxIter=100, regParam=.05, alpha=40,
                nonnegative=True,
                coldStartStrategy="drop",
                implicitPrefs=False)

Column names

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Arguments

  • userCol: Name of the column that contains user IDs
  • itemCol: Name of the column that contains item IDs
  • ratingCol: Name of the column that contains ratings

[Figure: blank original ratings matrix with two blank factor matrices]


[Figure: same ratings and factor matrices, with the latent dimensions highlighted]


Rank

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Hyperparameters

  • rank, $k$: number of latent features
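
To make the role of rank concrete, here is a minimal plain-Python sketch, not Spark code. The factor values are made up for illustration. It assumes 3 users, 4 items, and $k = 2$, so each user and each item gets a vector of length 2, and a predicted rating is the dot product of the two vectors.

```python
k = 2  # rank: number of latent features (hypothetical value for this sketch)

# Made-up factor matrices: one row of length k per user and per item.
user_factors = [[1.0, 0.5], [0.2, 1.5], [0.9, 0.9]]              # 3 users x k
item_factors = [[2.0, 1.0], [0.5, 2.0], [1.0, 1.0], [0.0, 3.0]]  # 4 items x k

def predict(user, item):
    """Predicted rating = dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(user_factors[user], item_factors[item]))

# Multiplying the factors back together fills every cell of the ratings matrix.
predicted = [[predict(u, i) for i in range(len(item_factors))]
             for u in range(len(user_factors))]
```

A larger rank gives each user and item more numbers to describe them, at the cost of more parameters to fit.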

MaxIter

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Hyperparameters

  • rank, $k$: number of latent features
  • maxIter: maximum number of iterations
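
The effect of the iteration count can be seen in a toy version of the algorithm. The sketch below is plain Python with rank 1 and made-up ratings. It is an illustration of the alternating idea only, not Spark's distributed implementation. The names `ratings`, `lam`, and `run_als` are hypothetical.

```python
# Hypothetical observed ratings: (user, item) -> rating
ratings = {
    (0, 0): 5.0, (0, 1): 3.0,
    (1, 0): 4.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 1.0,
}
n_users, n_items = 3, 3
lam = 0.05  # small L2 penalty, playing the role of regParam

def objective(u, v):
    """Squared error on observed ratings plus the L2 penalty on both factors."""
    err = sum((r - u[i] * v[j]) ** 2 for (i, j), r in ratings.items())
    return err + lam * (sum(x * x for x in u) + sum(x * x for x in v))

def run_als(max_iter):
    u = [1.0] * n_users  # one number per user (rank 1)
    v = [1.0] * n_items  # one number per item (rank 1)
    history = []
    for _ in range(max_iter):
        # Hold item factors fixed; solve each user factor in closed form.
        for i in range(n_users):
            num = sum(r * v[j] for (a, j), r in ratings.items() if a == i)
            den = lam + sum(v[j] ** 2 for (a, j), _r in ratings.items() if a == i)
            u[i] = num / den
        # Hold user factors fixed; solve each item factor in closed form.
        for j in range(n_items):
            num = sum(r * u[i] for (i, b), r in ratings.items() if b == j)
            den = lam + sum(u[i] ** 2 for (i, b), _r in ratings.items() if b == j)
            v[j] = num / den
        history.append(objective(u, v))
    return history

history = run_als(max_iter=10)  # objective value after each iteration
```

Each pass can only lower or hold the objective, so `history` never increases. Once the values stop changing, more iterations add compute time without improving the fit.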

RegParam

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Hyperparameters

  • rank, $k$: number of latent features
  • maxIter: maximum number of iterations
  • regParam: regularization parameter, lambda ($\lambda$)
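
For reference, a common textbook form of the regularized objective that ALS minimizes is sketched below. The symbols are assumptions introduced here, not notation from the slides: $r_{ui}$ is an observed rating, $\mathbf{x}_u$ and $\mathbf{y}_i$ are the length-$k$ user and item factor vectors, and $\Omega$ is the set of observed (user, item) pairs. Spark's implementation may scale the penalty differently, for example by the number of ratings per user or item.

$$
\min_{\mathbf{x},\,\mathbf{y}} \sum_{(u,i) \in \Omega} \left( r_{ui} - \mathbf{x}_u^{\top} \mathbf{y}_i \right)^2
+ \lambda \left( \sum_u \lVert \mathbf{x}_u \rVert^2 + \sum_i \lVert \mathbf{y}_i \rVert^2 \right)
$$

A larger $\lambda$ shrinks the factors toward zero and reduces overfitting. A smaller $\lambda$ lets the model fit the training ratings more closely.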

Alpha

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Hyperparameters

  • rank, $k$: number of latent features
  • maxIter: maximum number of iterations
  • regParam: regularization parameter, lambda ($\lambda$)
  • alpha: Discussed later. Only used with implicit ratings.

Non-negative

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Additional Arguments

  • nonnegative = True: Constrains the latent factors to non-negative values

Cold start strategy

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Additional Arguments

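  • nonnegative = True: Constrains the latent factors to non-negative values
  • coldStartStrategy = "drop": Drops rows with NaN predictions, which occur when a user or item in the test set was not seen in the training set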
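
The following plain-Python sketch shows why this matters at evaluation time. The (actual, predicted) pairs are made up. A single NaN prediction turns the whole RMSE into NaN, and filtering those rows out first, which is roughly what the "drop" strategy does to the prediction output, gives a usable metric.

```python
import math

# Hypothetical (actual, predicted) pairs. NaN marks a user or item that was
# absent from the training split, so the model has no factors for it.
pairs = [(4.0, 3.5), (2.0, 2.5), (5.0, float("nan")), (3.0, 3.0)]

def rmse(rows):
    """Root mean squared error over (actual, predicted) pairs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in rows) / len(rows))

rmse_with_nan = rmse(pairs)  # NaN propagates through the sum

# Keep only rows that have a real prediction, then evaluate.
kept = [(a, p) for a, p in pairs if not math.isnan(p)]
rmse_after_drop = rmse(kept)
```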

Implicit preferences

als_model = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, alpha=40,
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Additional Arguments

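  • nonnegative = True: Constrains the latent factors to non-negative values
  • coldStartStrategy = "drop": Drops rows with NaN predictions, which occur when a user or item in the test set was not seen in the training set
  • implicitPrefs: True for implicit ratings, False for explicit ratings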
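
As a rough illustration of how implicit ratings differ from explicit ones, the sketch below follows a common formulation from the implicit-feedback literature. It is an assumption here, not something stated on the slides. A raw count such as plays is split into a binary preference and a confidence weight of 1 + alpha * count. The data and helper names are hypothetical.

```python
alpha = 40  # same value as the alpha argument shown on the slides

# Hypothetical implicit feedback: (user, item) -> play count
play_counts = {("u1", "songA"): 12, ("u1", "songB"): 0, ("u2", "songA"): 1}

def preference(count):
    """1 if the user interacted with the item at all, else 0."""
    return 1 if count > 0 else 0

def confidence(count, alpha):
    """More interactions mean more confidence in the observed preference."""
    return 1 + alpha * count

# (preference, confidence) for every user-item pair
weighted = {key: (preference(c), confidence(c, alpha))
            for key, c in play_counts.items()}
```

With explicit ratings (implicitPrefs=False) the rating value itself is the target, so no such split is needed and alpha has no effect.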

Sample ALS model build

als = ALS(userCol="userId", itemCol="movieId", ratingCol="rating", 
            rank=25, maxIter=100, regParam=.05, 
            nonnegative=True,
            coldStartStrategy="drop",
            implicitPrefs=False)

Fit and transform methods

# Fit ALS to training dataset
model = als.fit(training_data)

# Generate predictions on test dataset
predictions = model.transform(test_data)

Let's practice!
