Market Basket Analysis in Python
Isaiah Hull
Visiting Associate Professor of Finance, BI Norwegian Business School
import pandas as pd
# Load transactions with pandas.
books = pd.read_csv("datasets/bookstore.csv")
# Split transaction strings into lists.
transactions = books['Transaction'].apply(lambda t: t.split(','))
# Convert the Series of lists into a list of lists.
transactions = list(transactions)
print(transactions[:5])
[['language', 'travel', 'humor', 'fiction'],
['humor', 'language'],
['humor', 'biography', 'cooking'],
['cooking', 'language'],
['travel']]
Association rule
Rule with multiple antecedents
Rule with multiple consequents
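The three rule shapes named above can be sketched in code. This is only an illustration: the `Rule` type, the `applies_to` helper, and the specific genre combinations are hypothetical and are not taken from the bookstore dataset's actual rules. The basket is the first transaction printed earlier.

```python
from typing import FrozenSet, NamedTuple


class Rule(NamedTuple):
    """Hypothetical container: 'if antecedent, then consequent'."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]


def applies_to(rule, transaction):
    """True when the transaction contains every antecedent and consequent item."""
    return (rule.antecedent | rule.consequent) <= set(transaction)


# Illustrative rules only; the item combinations are assumptions.
simple = Rule(frozenset({"humor"}), frozenset({"language"}))
multi_antecedent = Rule(frozenset({"humor", "travel"}), frozenset({"language"}))
multi_consequent = Rule(frozenset({"biography"}), frozenset({"history", "language"}))

basket = ['language', 'travel', 'humor', 'fiction']
print(applies_to(simple, basket))            # True
print(applies_to(multi_antecedent, basket))  # True
print(applies_to(multi_consequent, basket))  # False
```

Using sets for both sides keeps the single-item and multi-item cases uniform: a simple rule is just a rule whose antecedent and consequent each hold one item.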
Finding good rules is difficult.
What if we restrict ourselves to simple rules?
| Fiction rules | Poetry rules | ... | Humor rules |
|---|---|---|---|
| fiction->poetry | poetry->fiction | ... | humor->fiction |
| fiction->history | poetry->history | ... | humor->history |
| fiction->biography | poetry->biography | ... | humor->biography |
| fiction->cooking | poetry->cooking | ... | humor->cooking |
| ... | ... | ... | ... |
| fiction->humor | poetry->humor | ... | |
from itertools import permutations
# Extract unique items.
flattened = [item for transaction in transactions for item in transaction]
items = list(set(flattened))
# Compute and print rules.
rules = list(permutations(items, 2))
print(rules)
[('fiction', 'poetry'),
('fiction', 'history'),
...
('humor', 'travel'),
('humor', 'language')]
# Print the number of rules
print(len(rules))
72
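The count of 72 follows from the fact that `permutations(items, 2)` yields n * (n - 1) ordered pairs for n unique items. A minimal sketch that recovers n from the printed count, assuming nothing beyond the value 72 shown above:

```python
from itertools import permutations

n_rules = 72  # value printed by len(rules) above

# Solve n * (n - 1) == n_rules for the number of unique items n.
n_items = next(n for n in range(2, n_rules + 1) if n * (n - 1) == n_rules)
print(n_items)  # 9

# Cross-check by enumerating ordered pairs over n_items placeholder items.
assert len(list(permutations(range(n_items), 2))) == n_rules
```

So the dataset contains nine distinct genres, and each contributes eight one-antecedent, one-consequent rules.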

# Import the Apriori and association rules functions.
from mlxtend.frequent_patterns import apriori, association_rules
# Compute frequent itemsets using the Apriori algorithm.
# `onehot` is not defined in this section; it is assumed to be a one-hot encoded
# boolean DataFrame with one row per transaction and one column per item.
frequent_itemsets = apriori(onehot, min_support=0.001,
                            max_len=2, use_colnames=True)
# Compute all association rules for frequent_itemsets with a lift of at least 1.0.
rules = association_rules(frequent_itemsets,
                          metric="lift",
                          min_threshold=1.0)
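To make the `metric = "lift"` filter concrete, here is a small pure-Python sketch of support and lift, computed on the five sample transactions printed earlier. The `support` and `lift` helpers are hypothetical names written for this illustration, not mlxtend functions, and five transactions are far too few to draw conclusions from.

```python
# The five sample transactions printed earlier in this section.
transactions = [['language', 'travel', 'humor', 'fiction'],
                ['humor', 'language'],
                ['humor', 'biography', 'cooking'],
                ['cooking', 'language'],
                ['travel']]


def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """support(A and B) / (support(A) * support(B))."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / (support(antecedent, transactions)
                    * support(consequent, transactions))


print(support({'humor', 'language'}, transactions))  # 0.4
print(lift({'humor'}, {'language'}, transactions))   # about 1.11
print(lift({'humor'}, {'travel'}, transactions))     # about 0.83
```

On this tiny sample, humor->language has a lift above 1 and would pass a threshold of 1.0, while humor->travel has a lift below 1 and would be dropped.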