Non-personalized recommendations

Building Recommendation Engines in Python

Rob O'Callaghan

Director of Data

Non-personalized ratings

Example of three books that are frequently bought together.

Finding the most popular items

book_df DataFrame:

User	Book
User_233	The Great Gatsby
User_651	The Catcher in the Rye
User_131	The Lord of the Rings
User_965	Little Women
User_651	Fifty Shades of Grey
...	...

Finding the most popular items

book_df['book'].value_counts()

Fifty Shades of Grey                   524
Harry Potter and the Sorcerer's Stone  487
The Da Vinci Code                      455
The Twilight Saga                      401
The Lord of the Rings                  278
                                     ...

Finding the most popular items

print(book_df['book'].value_counts().index)

Index(['Fifty Shades of Grey', "Harry Potter and the Sorcerer's Stone",
       'The Da Vinci Code', 'The Twilight Saga',
       'The Lord of the Rings'],
      dtype='object')
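
The ranking above can be reproduced end to end on a small example. This is a minimal sketch, not the course dataset: the toy `book_df` below is invented for illustration, and the column names `user` and `book` are assumed to match the snippets in this lesson.

```python
import pandas as pd

# Toy purchase log, invented for illustration only (not the course data).
book_df = pd.DataFrame({
    "user": ["User_1", "User_2", "User_3", "User_4", "User_5", "User_6"],
    "book": ["Hamlet", "1984", "Hamlet", "Little Women", "Hamlet", "1984"],
})

# value_counts() counts how often each title occurs, sorted from most to least frequent.
book_counts = book_df["book"].value_counts()

# The index of the counts holds the titles themselves, most popular first.
top_books = book_counts.index[:2].tolist()
print(top_books)  # ['Hamlet', '1984']
```

Slicing the index rather than the counts keeps only the titles, which is usually what a "most popular" recommendation list needs.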

Finding the most liked items

user_ratings DataFrame:

User	Book	Rating
User_233	The Great Gatsby	3.0
User_651	The Catcher in the Rye	5.0
User_131	The Lord of the Rings	3.0
User_965	Little Women	4.0
User_651	Fifty Shades of Grey	2.0
...	...	...

Finding the most liked items

avg_rating_df = user_ratings[["book", "rating"]].groupby(['book']).mean()
avg_rating_df.head()

                                      rating
book
Hamlet                                   4.1
The Da Vinci Code                       2.1
Gone with the Wind                       4.2
Fifty Shades of Grey                     1.2
Wuthering Heights                        3.9
                                          ...

Finding the most liked items

sorted_avg_rating_df = avg_rating_df.sort_values(by="rating", ascending=False)
sorted_avg_rating_df.head()

                                      rating
book
The Girl in the Fog                      5.0
Behind the Bell                          5.0
Across the River and into the Trees      5.0
The Complete McGonagall                  5.0
What Is to Be Done?                      5.0
                                          ...

Finding the most liked items

(user_ratings['book']=='The Girl in the Fog').sum()

(user_ratings['book']=='Valley of the Dolls').sum()

(user_ratings['book']=='Across the River and into the Trees').sum()
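
Rather than checking titles one at a time, the average rating and the number of ratings can be shown side by side. This is a rough sketch under stated assumptions: the toy `user_ratings` data and the title "Obscure Title" are invented for illustration, and the `book` and `rating` column names are assumed.

```python
import pandas as pd

# Toy ratings, invented for illustration only (not the course data).
user_ratings = pd.DataFrame({
    "user": ["User_1", "User_2", "User_3", "User_4", "User_5"],
    "book": ["Hamlet", "Hamlet", "Hamlet", "Obscure Title", "1984"],
    "rating": [4.0, 5.0, 3.0, 5.0, 2.0],
})

# Compute the mean rating and the number of ratings per book in one pass,
# then sort by the mean so the "best" books come first.
rating_summary = (
    user_ratings.groupby("book")["rating"]
    .agg(["mean", "count"])
    .sort_values(by="mean", ascending=False)
)
print(rating_summary)
```

In this toy data the top row is a book with a single 5.0 rating, which illustrates why an average alone can be a misleading measure of how well liked an item is.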

Finding the most liked popular items

book_frequency = user_ratings["book"].value_counts()
print(book_frequency)

Fifty Shades of Grey                   524
Harry Potter and the Sorcerer's Stone  487
                                       ...

frequently_reviewed_books = book_frequency[book_frequency > 100].index
print(frequently_reviewed_books)

Index(['The Lord of the Rings', 'To Kill a Mockingbird', 'Of Mice and Men',
       '1984', 'Hamlet'],
      dtype='object')

Finding the most liked popular items

# Keep only the ratings of books that were reviewed frequently
frequent_books_df = user_ratings[user_ratings["book"].isin(frequently_reviewed_books)]

# Average rating per frequently reviewed book, highest first
frequent_books_avgs = frequent_books_df[["book", "rating"]].groupby('book').mean()
print(frequent_books_avgs.sort_values(by="rating", ascending=False).head())

                                      rating
book
To Kill a Mockingbird                    4.7
1984                                     4.7
Harry Potter and the Sorcerer's Stone    4.6
                                          ...
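
The steps on the last two slides can be collected into one function. This is a sketch only: the helper name `top_rated_popular`, its parameters `min_ratings` and `n`, and the toy data are hypothetical and not part of the course material. The threshold is strict (more than `min_ratings`), mirroring the `> 100` comparison above.

```python
import pandas as pd

def top_rated_popular(ratings, min_ratings=100, n=5):
    """Hypothetical helper: best-rated books among those with more than min_ratings ratings."""
    # Count how many ratings each book received.
    counts = ratings["book"].value_counts()
    # Titles rated more often than the threshold.
    frequent = counts[counts > min_ratings].index
    # Restrict the ratings to those titles and average per book.
    frequent_df = ratings[ratings["book"].isin(frequent)]
    avgs = frequent_df[["book", "rating"]].groupby("book").mean()
    return avgs.sort_values(by="rating", ascending=False).head(n)

# Toy ratings, invented for illustration only (not the course data).
user_ratings = pd.DataFrame({
    "book": ["Hamlet"] * 3 + ["1984"] * 3 + ["Obscure Title"],
    "rating": [4.0, 5.0, 3.0, 5.0, 5.0, 4.0, 5.0],
})

result = top_rated_popular(user_ratings, min_ratings=2)
print(result)
```

With the threshold set to 2, the single-rating title drops out even though its average is the highest, and the remaining books are ranked by mean rating.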

Let's practice!
