Non-personalized recommendations

Building Recommendation Engines in Python

Robert O'Callaghan

Director of Data

Identifying pairs

0     User_223           The Great Gatsby <---| Read by the same user
1     User_223     The Catcher in the Rye <---|
2     User_131      The Lord of the Rings
3     User_965               Little Women <---| Read by the same user
4     User_965       Fifty Shades of Grey <---|
... ...

Permutations versus combinations

User	book_title
User_223	The Great Gatsby
User_223	The Catcher in the Rye

To:

	Book A	Book B
0	The Great Gatsby	The Catcher in the Rye
1	The Catcher in the Rye	The Great Gatsby

Books seen with The Great Gatsby -> The Catcher in the Rye

Books seen with The Catcher in the Rye -> The Great Gatsby
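
To make the distinction concrete, the short sketch below compares `itertools.permutations` with `itertools.combinations` on the two titles from the example. The variable names are illustrative only and do not come from the slides.

```python
from itertools import combinations, permutations

books = ["The Great Gatsby", "The Catcher in the Rye"]

# Permutations are ordered: (A, B) and (B, A) are both produced.
perm_pairs = list(permutations(books, 2))

# Combinations ignore order: only (A, B) is produced.
comb_pairs = list(combinations(books, 2))

print(perm_pairs)
print(comb_pairs)
```

Since a recommendation is looked up starting from a single title, both orderings are presumably wanted here, which would explain the choice of permutations.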

Creating the pairing function

from itertools import permutations

def create_pairs(x):
  # Skeleton only: the body that builds pairs is filled in step by step

  return pairs

Creating the pairing function

from itertools import permutations

def create_pairs(x):
  # Lazily generate every ordered pair of titles in x
  pairs = permutations(x.values, 2)

  return pairs

permutations(list, length_of_permutations): generates an iterable containing all permutations

Creating the pairing function

from itertools import permutations

def create_pairs(x):
  # Materialize the permutations so they can be reused
  pairs = list(permutations(x.values, 2))

  return pairs

permutations(list, length_of_permutations): generates an iterable containing all permutations
list(): converts this into a usable list

Creating the pairing function

from itertools import permutations

import pandas as pd

def create_pairs(x):
  # Build every ordered pair of titles and label the two columns
  pairs = pd.DataFrame(list(permutations(x.values, 2)),
                       columns=['book_a', 'book_b'])
  return pairs

permutations(list, length_of_permutations): generates an iterable containing all permutations
list(): converts this into a usable list
pd.DataFrame(): converts the list into a DataFrame with columns book_a and book_b
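
As a quick check of the finished function, the sketch below calls `create_pairs` on a single user's reading list. The `books_read` Series is a made-up example, not data from the course.

```python
from itertools import permutations

import pandas as pd

def create_pairs(x):
    # Every ordered pair of distinct positions in x becomes one row.
    pairs = pd.DataFrame(list(permutations(x.values, 2)),
                         columns=['book_a', 'book_b'])
    return pairs

# Hypothetical reading list for a single user.
books_read = pd.Series(['The Great Gatsby',
                        'The Catcher in the Rye',
                        'Little Women'])

pairs = create_pairs(books_read)
print(pairs)
```

Three titles give 3 x 2 = 6 ordered pairs, and no title is ever paired with itself.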

Applying the function to the data

book_pairs = book_df.groupby('userId')['book_title'].apply(create_pairs)
print(book_pairs.head())

                                book_a                                   book_b
userId                                                   
User_223     0        The Great Gatsby                   The Catcher in the Rye
             1  The Catcher in the Rye                         The Great Gatsby
User_965     0            Little Women                     Fifty Shades of Grey
             1    Fifty Shades of Grey                             Little Women
User_773     0       The Twilight Saga    Harry Potter and the Sorcerer's Stone
                                                                            ...

Cleaning up the results

book_pairs = book_pairs.reset_index(drop=True)
print(book_pairs.head())

                     book_a                                   book_b
0          The Great Gatsby                   The Catcher in the Rye
1    The Catcher in the Rye                         The Great Gatsby
2              Little Women                     Fifty Shades of Grey
3      Fifty Shades of Grey                             Little Women
4         The Twilight Saga    Harry Potter and the Sorcerer's Stone
                                                                 ...
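
The grouping and clean-up steps can be tried end to end on a few rows. This is a minimal sketch with assumed toy data modeled on the earlier example, in which User_131 has read only one book.

```python
from itertools import permutations

import pandas as pd

def create_pairs(x):
    return pd.DataFrame(list(permutations(x.values, 2)),
                        columns=['book_a', 'book_b'])

# Toy data, assumed for illustration only.
book_df = pd.DataFrame({
    'userId': ['User_223', 'User_223', 'User_131', 'User_965', 'User_965'],
    'book_title': ['The Great Gatsby', 'The Catcher in the Rye',
                   'The Lord of the Rings', 'Little Women',
                   'Fifty Shades of Grey'],
})

# Each user's titles are passed to create_pairs as one Series.
book_pairs = book_df.groupby('userId')['book_title'].apply(create_pairs)

# drop=True discards the old index instead of turning it into columns.
book_pairs = book_pairs.reset_index(drop=True)
print(book_pairs)
```

A user with a single book has no permutations of length 2, so that user contributes no rows to the result.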

Counting the pairings

pair_counts = book_pairs.groupby(['book_a', 'book_b']).size()

book_a                                book_b                             
The Twilight Saga                     Fifty Shades of Grey           16
                                      Pride and Prejudice            12
                                                                    ...

pair_counts_df = pair_counts.to_frame(name='size').reset_index()
print(pair_counts_df.head())

     book_a                                book_b                       size    
1    The Twilight Saga                     Fifty Shades of Grey           16
2    The Twilight Saga                     Pride and Prejudice            12
                                                                         ...
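
The counting step can be reproduced on a tiny, hand-built pair table. The rows below are hypothetical and only show how `size()` and `to_frame()` fit together.

```python
import pandas as pd

# Hypothetical pair table, as if several users had contributed rows.
book_pairs = pd.DataFrame({
    'book_a': ['The Twilight Saga', 'The Twilight Saga',
               'The Twilight Saga', 'Fifty Shades of Grey'],
    'book_b': ['Fifty Shades of Grey', 'Fifty Shades of Grey',
               'Pride and Prejudice', 'The Twilight Saga'],
})

# size() counts the rows in each (book_a, book_b) group and returns a
# Series indexed by both titles.
pair_counts = book_pairs.groupby(['book_a', 'book_b']).size()

# to_frame names the count column; reset_index turns the titles back
# into ordinary columns.
pair_counts_df = pair_counts.to_frame(name='size').reset_index()
print(pair_counts_df)
```

Because `size` is also the name of a DataFrame attribute, the count column has to be read with bracket notation, `pair_counts_df['size']`, rather than `pair_counts_df.size`.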

Looking up recommendations

pair_counts_sorted = pair_counts_df.sort_values('size', ascending=False)

pair_counts_sorted[pair_counts_sorted['book_a'] == 'Lord of the Rings']

                  book_a                                     book_b size
137    Lord of the Rings                                 The Hobbit   12
147    Lord of the Rings      Harry Potter and the Sorcerer's Stone   10
143    Lord of the Rings                        The Colour of Magic    9
                                                                     ...
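
The lookup can be wrapped in a small helper. `top_recommendations` is a hypothetical name introduced here for illustration, and the counts are made-up values loosely modeled on the output above.

```python
import pandas as pd

# Hypothetical counts, for illustration only.
pair_counts_df = pd.DataFrame({
    'book_a': ['Lord of the Rings', 'Lord of the Rings',
               'The Hobbit', 'Lord of the Rings'],
    'book_b': ['The Colour of Magic', 'The Hobbit',
               'Lord of the Rings', "Harry Potter and the Sorcerer's Stone"],
    'size': [9, 12, 12, 10],
})

def top_recommendations(counts, title, n=3):
    """Return the n books most often paired with `title` (hypothetical helper)."""
    matches = counts[counts['book_a'] == title]
    return matches.sort_values('size', ascending=False).head(n)

recs = top_recommendations(pair_counts_df, 'Lord of the Rings', n=2)
print(recs['book_b'].tolist())
```

The title must match the stored string exactly; a lookup for a title that never appears in `book_a` returns an empty DataFrame rather than raising an error.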

Let's practice!
