Market Basket Analysis in Python
Isaiah Hull
Visiting Associate Professor of Finance, BI Norwegian Business School
import pandas as pd
# Load the transactions with pandas.
books = pd.read_csv("datasets/bookstore.csv")
# Split transaction strings into lists.
transactions = books['Transaction'].apply(lambda t: t.split(','))
# Convert the Series of lists into a list of lists.
transactions = list(transactions)
print(transactions[:5])
[['language', 'travel', 'humor', 'fiction'],
['humor', 'language'],
['humor', 'biography', 'cooking'],
['cooking', 'language'],
['travel']]
Association rules
Multi-antecedent rules
Multi-consequent rules
Finding useful rules is difficult.
What if we restrict ourselves to simple rules?
| Fiction Rules | Poetry Rules | ... | Humor Rules |
|---|---|---|---|
| fiction->poetry | poetry->fiction | ... | humor->fiction |
| fiction->history | poetry->history | ... | humor->history |
| fiction->biography | poetry->biography | ... | humor->biography |
| fiction->cooking | poetry->cooking | ... | humor->cooking |
| ... | ... | ... | ... |
| fiction->humor | poetry->humor | ... | |
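To see why restricting to simple rules helps, it can be useful to count the candidates. The sketch below is not part of the original slides. It assumes nine distinct items (inferred from the 72 simple rules this deck reports for the bookstore data) and counts a rule as any pair of non-empty, non-overlapping item sets. The function names are hypothetical.

```python
from itertools import combinations


def count_simple_rules(n_items):
    """Rules with exactly one antecedent and one consequent item."""
    return n_items * (n_items - 1)


def count_all_rules(n_items):
    """Rules with non-empty, disjoint antecedent and consequent sets.

    Each item is placed in the antecedent, the consequent, or neither
    (3**n ways); assignments with an empty side are then removed.
    """
    return 3 ** n_items - 2 ** (n_items + 1) + 1


def enumerate_rules(items):
    """Brute-force enumeration, used only to check the formula on small inputs."""
    rules = []
    for a_size in range(1, len(items)):
        for antecedent in combinations(items, a_size):
            remaining = [i for i in items if i not in antecedent]
            for c_size in range(1, len(remaining) + 1):
                for consequent in combinations(remaining, c_size):
                    rules.append((antecedent, consequent))
    return rules


n = 9  # assumption: number of distinct items in the dataset
print(count_simple_rules(n))  # 72
print(count_all_rules(n))     # 18660
```

Under these assumptions, allowing multi-antecedent and multi-consequent rules raises the candidate count from 72 to 18,660, and the gap widens quickly as items are added.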
from itertools import permutations
# Extract unique items.
flattened = [item for transaction in transactions for item in transaction]
items = list(set(flattened))
# Compute and print rules.
rules = list(permutations(items, 2))
print(rules)
[('fiction', 'poetry'),
('fiction', 'history'),
...
('humor', 'travel'),
('humor', 'language')]
# Print the number of rules
print(len(rules))
72
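The `apriori` call that follows takes a variable named `onehot`, which this excerpt does not define. It is presumably a one-hot encoded table with one row per transaction and one Boolean column per item. As a minimal sketch of that encoding in plain Python, using only the five transactions printed earlier (the name `onehot_rows` is hypothetical):

```python
# The five sample transactions printed earlier in the deck.
transactions = [
    ['language', 'travel', 'humor', 'fiction'],
    ['humor', 'language'],
    ['humor', 'biography', 'cooking'],
    ['cooking', 'language'],
    ['travel'],
]

# Collect the distinct items in a stable (sorted) order.
items = sorted({item for transaction in transactions for item in transaction})

# One dict per transaction: True if the item is in the basket, else False.
onehot_rows = [
    {item: (item in transaction) for item in items}
    for transaction in transactions
]

print(items)
print(onehot_rows[1])
```

Passing `onehot_rows` to `pd.DataFrame` should give a Boolean table of the kind `apriori` expects. mlxtend also provides `TransactionEncoder` in `mlxtend.preprocessing` for the same step.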

# Import the Apriori and association rules functions
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
# Compute frequent itemsets using the Apriori algorithm
# (onehot is assumed to be a one-hot encoded DataFrame of the
# transactions; it is not defined in this excerpt)
frequent_itemsets = apriori(onehot, min_support = 0.001,
                            max_len = 2, use_colnames = True)
# Compute all association rules for frequent_itemsets,
# keeping only rules whose lift is at least 1.0
rules = association_rules(frequent_itemsets,
                          metric = "lift",
                          min_threshold = 1.0)
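The filter above keeps rules by lift. As a rough illustration of that metric, the sketch below computes support and lift by hand on the five sample transactions printed earlier. It uses the standard definition, lift(A -> B) = support(A and B) / (support(A) * support(B)). The helper names are hypothetical, not mlxtend functions, and the code assumes every item queried occurs at least once.

```python
# The five sample transactions printed earlier in the deck.
transactions = [
    ['language', 'travel', 'humor', 'fiction'],
    ['humor', 'language'],
    ['humor', 'biography', 'cooking'],
    ['cooking', 'language'],
    ['travel'],
]


def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def lift(antecedent, consequent, transactions):
    """Joint support divided by the product of the individual supports."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / (support(antecedent, transactions)
                    * support(consequent, transactions))


print(round(lift(['humor'], ['language'], transactions), 3))    # 1.111
print(round(lift(['cooking'], ['language'], transactions), 3))  # 0.833
```

On this tiny sample, humor -> language has lift above 1.0 and would pass a `min_threshold` of 1.0, while cooking -> language falls below it and would be dropped.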