Collaborative filtering

Building Recommendation Engines in Python

Rob O'Callaghan

Director of Data

Finding similar users

Working with real data

user_ratings DataFrame:

User        Book                      Rating
User_233    The Great Gatsby          3.0
User_651    The Catcher in the Rye    5.0
User_131    The Lord of the Rings     3.0
User_965    The Great Gatsby          4.0
User_651    Fifty Shades of Grey      4.0
...         ...                       ...
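
To follow along without the course data, a small stand-in for user_ratings can be built by hand. This is only a sketch: the rows are the five shown above plus the User_965 rating for The Catcher in the Rye that appears in the pivoted output, and the real dataset is assumed to have the same three columns with many more rows.

```python
import pandas as pd

# Hand-built stand-in for the course's user_ratings DataFrame.
# Assumption: the real data uses the same three columns (User, Book, Rating).
user_ratings = pd.DataFrame(
    {
        "User": ["User_233", "User_651", "User_131",
                 "User_965", "User_651", "User_965"],
        "Book": ["The Great Gatsby", "The Catcher in the Rye",
                 "The Lord of the Rings", "The Great Gatsby",
                 "Fifty Shades of Grey", "The Catcher in the Rye"],
        "Rating": [3.0, 5.0, 3.0, 4.0, 4.0, 3.0],
    }
)

# Each row is one (user, book, rating) observation: the "long" format.
print(user_ratings)
```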

Pivoting our data

user_ratings_pivot = user_ratings.pivot(index='User', 
                                        columns='Book',
                                        values='Rating')
print(user_ratings_pivot)
Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_233               3.0                       NaN                     NaN
User_651               NaN                       5.0                     4.0
User_965               4.0                       3.0                     NaN
     ...               ...                       ...                     ...
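
One practical caveat: pivot() raises a ValueError when the same user has rated the same book more than once. A possible workaround, sketched here with a made-up three-row DataFrame called ratings, is pivot_table() with an aggregation such as the mean. Whether averaging duplicates is appropriate depends on the data; it is an assumption of this sketch.

```python
import pandas as pd

# Illustrative data (not from the course): User_651 rated the same book twice.
ratings = pd.DataFrame(
    {
        "User": ["User_651", "User_651", "User_965"],
        "Book": ["The Catcher in the Rye", "The Catcher in the Rye",
                 "The Great Gatsby"],
        "Rating": [5.0, 4.0, 4.0],
    }
)

# pivot() cannot reshape when an (index, column) pair occurs more than once.
try:
    ratings.pivot(index="User", columns="Book", values="Rating")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead (here: their mean).
pivot_mean = ratings.pivot_table(index="User", columns="Book",
                                 values="Rating", aggfunc="mean")
print(pivot_failed)
print(pivot_mean)
```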

Data sparsity

Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_233               3.0                       NaN                     NaN
User_651               NaN                       5.0                     4.0
User_965               4.0                       3.0                     NaN
     ...               ...                       ...                     ...
print(user_ratings_pivot.dropna())
Empty DataFrame
Columns: [The Great Gatsby, The Catcher in the Rye, Fifty Shades of Grey]
Index: []
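
Before dropping or filling anything, it can help to measure how sparse the matrix is. The sketch below rebuilds only the three rows shown above (the name pivot is a stand-in for user_ratings_pivot) and counts missing cells overall and ratings per book.

```python
import numpy as np
import pandas as pd

# The three visible rows of the pivoted table; the real matrix is much larger.
pivot = pd.DataFrame(
    {
        "The Great Gatsby": [3.0, np.nan, 4.0],
        "The Catcher in the Rye": [np.nan, 5.0, 3.0],
        "Fifty Shades of Grey": [np.nan, 4.0, np.nan],
    },
    index=pd.Index(["User_233", "User_651", "User_965"], name="User"),
)

# Fraction of all cells that hold no rating.
missing_fraction = pivot.isna().sum().sum() / pivot.size

# Number of users who rated each book.
ratings_per_book = pivot.notna().sum()

# Rows with no missing values at all; empty here, as on the slide.
complete_rows = pivot.dropna()

print(round(missing_fraction, 3))
print(ratings_per_book)
print(len(complete_rows))
```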

Filling the missing values

Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_233               3.0                       NaN                     NaN
User_651               NaN                       5.0                     4.0
User_965               4.0                       3.0                     NaN
     ...               ...                       ...                     ...
print(user_ratings_pivot.loc[["User_651"]].fillna(0))
Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_651               0.0                       5.0                     4.0
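
Filling with 0 directly has a side effect worth seeing in numbers: the unrated book is treated as if the user had given it the lowest possible score. A minimal sketch using User_651's row (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

# User_651's ratings as shown above; NaN marks a book they have not rated.
row = pd.Series(
    [np.nan, 5.0, 4.0],
    index=["The Great Gatsby", "The Catcher in the Rye", "Fifty Shades of Grey"],
    name="User_651",
)

# mean() skips NaN by default, so this averages only the two real ratings.
mean_before = row.mean()

# After fillna(0) the missing rating counts as a 0 and drags the average down.
mean_after = row.fillna(0).mean()

print(mean_before, mean_after)
```

The drop from 4.5 to 3.0 comes purely from the fill, not from anything the user actually rated.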

Filling the missing values

avg_ratings = user_ratings_pivot.mean(axis=1)

user_ratings_pivot = user_ratings_pivot.sub(avg_ratings, axis=0)
print(user_ratings_pivot)
Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_233               0.0                       NaN                     NaN
User_651               NaN                       0.5                    -0.5
User_965               0.5                      -0.5                     NaN
     ...               ...                       ...                     ...
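
A quick sanity check on the centering step: after subtracting each user's own average, every row should average to zero over the books that user actually rated. The sketch below repeats the step on the three visible rows only, using the stand-in names pivot and centered.

```python
import numpy as np
import pandas as pd

# The three visible rows of the pivoted table.
pivot = pd.DataFrame(
    {
        "The Great Gatsby": [3.0, np.nan, 4.0],
        "The Catcher in the Rye": [np.nan, 5.0, 3.0],
        "Fifty Shades of Grey": [np.nan, 4.0, np.nan],
    },
    index=pd.Index(["User_233", "User_651", "User_965"], name="User"),
)

# Per-user average over rated books (NaN is skipped).
avg_ratings = pivot.mean(axis=1)

# Subtract each user's average from that user's row; NaN stays NaN.
centered = pivot.sub(avg_ratings, axis=0)

# Every row mean should now be 0.
print(centered.mean(axis=1))
```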

Filling the missing values

user_ratings_pivot.fillna(0)
Book      The Great Gatsby    The Catcher in the Rye    Fifty Shades of Grey
User                    
User_233               0.0                       0.0                     0.0
User_651               0.0                       0.5                    -0.5
User_965               0.5                      -0.5                     0.0
     ...               ...                       ...                     ...
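
With centered, zero-filled rows, users can be compared as plain vectors. The slides do not say which similarity measure comes next; cosine similarity is one common choice and is used here purely as an assumption. The helper cosine() is hypothetical, and returning 0.0 for an all-zero row (such as User_233) is a design choice of this sketch, not a rule.

```python
import numpy as np
import pandas as pd

# Centered and zero-filled rows, as in the table above.
filled = pd.DataFrame(
    {
        "The Great Gatsby": [0.0, 0.0, 0.5],
        "The Catcher in the Rye": [0.0, 0.5, -0.5],
        "Fifty Shades of Grey": [0.0, -0.5, 0.0],
    },
    index=pd.Index(["User_233", "User_651", "User_965"], name="User"),
)


def cosine(a, b):
    """Cosine similarity of two 1-D arrays; 0.0 if either is all zeros."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm > 0 else 0.0


u651 = filled.loc["User_651"].to_numpy()
u965 = filled.loc["User_965"].to_numpy()
u233 = filled.loc["User_233"].to_numpy()

# Negative value: these two users deviate from their averages in opposite
# directions on the one book they both rated.
sim_651_965 = cosine(u651, u965)

# User_233's row is all zeros, so no direction can be compared.
sim_651_233 = cosine(u651, u233)

print(sim_651_965, sim_651_233)
```

Because unrated books sit at 0 after centering, they contribute nothing to the dot product, which is the point of centering before filling.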

Let's practice!
