Advanced Apriori results pruning

Market Basket Analysis in Python

Isaiah Hull

Visiting Associate Professor of Finance, BI Norwegian Business School

Applications

Cross-Promotion

This image shows an example of cross-promotion.

Aggregation

This image shows an example of aggregation.

The Apriori algorithm

List of Lists

This image shows an example of a list of lists of the items in transactions.

One-Hot Encoding

This image shows the one-hot encoding of transactions.
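
As a rough illustration of the step from a list of lists to a one-hot table, here is a minimal pure-Python sketch. The toy baskets are hypothetical and are not taken from the course data; the slides that follow use mlxtend's TransactionEncoder for the same job.

```python
# Hypothetical toy transactions (not from the course data set).
transactions = [
    ["bag", "candle"],
    ["box"],
    ["bag", "box", "sign"],
]

# Column order: every distinct item, sorted alphabetically.
columns = sorted({item for basket in transactions for item in basket})

# One row per transaction; True where the item appears in the basket.
onehot_rows = [[item in basket for item in columns] for basket in transactions]

print(columns)
for row in onehot_rows:
    print(row)
```

Each row has one Boolean per distinct item, which is the shape the Apriori implementation expects as input.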

Apriori Algorithm

This image illustrates the Apriori algorithm being applied to a generic set of 4 items.
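
The pruning idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not mlxtend's implementation: the function name apriori_sketch and the four-item toy baskets are hypothetical.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Share of baskets that contain every item in the itemset.
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent


# Hypothetical baskets over four generic items.
toy = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "D"]]
result = apriori_sketch(toy, min_support=0.5)
for itemset, value in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), value)
```

In this toy data, {A, B, C} is discarded without counting its support because its subset {B, C} is not frequent.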

The Apriori algorithm

import pandas as pd
import numpy as np
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
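# allow_pickle=True is required if the file stores ragged lists as an object array
itemsets = np.load('itemsets.npy', allow_pickle=True)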
print(itemsets)
[['EASTER CRAFT 4 CHICKS'],
['CERAMIC CAKE DESIGN SPOTTED MUG', 'CHARLOTTE BAG APPLES DESIGN'],
['SET 12 COLOUR PENCILS DOLLY GIRL'],
...
['JUMBO BAG RED RETROSPOT', ... 'LIPSTICK PEN FUSCHIA']]

The Apriori algorithm

# One-hot encode data
encoder = TransactionEncoder()
onehot = encoder.fit(itemsets).transform(itemsets)
onehot = pd.DataFrame(onehot, columns = encoder.columns_)
# Apply Apriori algorithm and print
frequent_itemsets = apriori(onehot, use_colnames=True, min_support=0.001)
print(frequent_itemsets)
      support                                           itemsets
0    0.001504                               ( DOLLY GIRL BEAKER)
1    0.002256                         ( RED SPOT GIFT BAG LARGE)
...
428  0.001504  (BIRTHDAY CARD, RETRO SPOT, JUMBO BAG RED RETR...

Apriori algorithm results

print(len(onehot.columns))
4201
print(len(frequent_itemsets))
2328
from mlxtend.frequent_patterns import association_rules
rules = association_rules(frequent_itemsets)

Association rules

print(rules['consequents'])
0                   (DOTCOM POSTAGE)
                        ... 
9                 (HERB MARKER THYME)
                        ...
234        (JUMBO BAG RED RETROSPOT)
235         (WOODLAND CHARLOTTE BAG)
236    (RED RETROSPOT CHARLOTTE BAG)
237       (STRAWBERRY CHARLOTTE BAG)
238      (CHARLOTTE BAG SUKI DESIGN)
Name: consequents, Length: 239, dtype: object

Filtering with multiple metrics

targeted_rules = rules[rules['consequents'] == {'HERB MARKER THYME'}].copy()
filtered_rules = targeted_rules[(targeted_rules['antecedent support'] > 0.01) &
                        (targeted_rules['support'] > 0.009) &
                        (targeted_rules['confidence'] > 0.85) &
                        (targeted_rules['lift'] > 1.00)]
print(filtered_rules['antecedents'])
9        (HERB MARKER BASIL)
25     (HERB MARKER PARSLEY)
27    (HERB MARKER ROSEMARY)
Name: antecedents, dtype: object
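
To make the four filters concrete, the sketch below computes antecedent support, support, confidence, and lift for a single rule by hand and applies the same thresholds as the slide. The helper rule_metrics and the toy baskets are hypothetical, and the sketch assumes the antecedent and consequent each appear in at least one basket.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute standard metrics for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    support_a = sum(a <= b for b in baskets) / n         # P(A)
    support_c = sum(c <= b for b in baskets) / n         # P(C)
    support_ac = sum((a | c) <= b for b in baskets) / n  # P(A and C)

    confidence = support_ac / support_a                  # P(C | A)
    lift = confidence / support_c                        # P(A and C) / (P(A) P(C))
    return {
        "antecedent support": support_a,
        "support": support_ac,
        "confidence": confidence,
        "lift": lift,
    }


# Hypothetical toy baskets (not the course data).
toy = [["basil", "thyme"], ["basil", "thyme"], ["thyme"], ["parsley"]]
m = rule_metrics(toy, ["basil"], ["thyme"])

# Same thresholds as the filter on the slide.
keep = (m["antecedent support"] > 0.01 and m["support"] > 0.009
        and m["confidence"] > 0.85 and m["lift"] > 1.00)
print(m, keep)
```

A rule survives only if it clears every threshold, which is why combining metrics prunes far more aggressively than any single one.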

Grouping products

The image shows a store floorplan where boxes are grouped with bags and signs are grouped with candles.

The image shows a store floorplan where boxes are grouped with candles and signs are grouped with bags.

The image shows a store floorplan where boxes are grouped with signs and candles are grouped with bags.

Aggregation and dissociation

# Load aggregated data
aggregated = pd.read_csv('datasets/online_retail_aggregated.csv')

# Compute frequent itemsets
onehot = encoder.fit(aggregated).transform(aggregated)
data = pd.DataFrame(onehot, columns = encoder.columns_)
frequent_itemsets = apriori(data, use_colnames=True)

# Compute standard metrics
rules = association_rules(frequent_itemsets)
# Compute Zhang's rule (zhangs_rule is a helper that is not defined in this section)
rules['zhang'] = zhangs_rule(rules)
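
For reference, Zhang's metric for a rule A -> B is commonly written as (P(AB) - P(A)P(B)) / max(P(AB)(1 - P(A)), P(A)(P(B) - P(AB))). The sketch below computes it from scalar supports. The function name zhang_metric is hypothetical and is not necessarily how the zhangs_rule helper used above is written; it assumes the denominator is nonzero.

```python
def zhang_metric(support_a, support_b, support_ab):
    """Zhang's metric for A -> B: ranges from -1 (dissociation) to +1 (association)."""
    numerator = support_ab - support_a * support_b
    denominator = max(support_ab * (1 - support_a),
                      support_a * (support_b - support_ab))
    # Assumption: denominator is nonzero (no degenerate supports of 0 or 1).
    return numerator / denominator


# Hypothetical supports chosen for illustration.
dissociated = zhang_metric(0.5, 0.5, 0.1)   # items rarely bought together
associated = zhang_metric(0.5, 0.5, 0.4)    # items often bought together
independent = zhang_metric(0.5, 0.5, 0.25)  # joint support equals the product
print(dissociated, associated, independent)
```

Negative values indicate dissociation, positive values indicate association, and zero corresponds to independence.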

Zhang's rule

# Print rules that indicate dissociation
print(rules[rules['zhang'] < 0][['antecedents','consequents']])
  antecedents consequents
2       (bag)    (candle)
3    (candle)       (bag)
4      (sign)       (bag)
5       (bag)      (sign)

Selecting a floorplan

The image shows a store floorplan where boxes are grouped with bags and signs are grouped with candles.
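
One way to turn the dissociation results into a floorplan choice is to reject any layout that groups a dissociated pair. The sketch below is an assumption about how that selection could be automated: the dissociated pairs are read off the Zhang's rule output, the three candidate layouts come from the "Grouping products" slide, and the dictionary keys are hypothetical labels.

```python
# Pairs with negative Zhang's metric, taken from the printed rules.
dissociated = {frozenset(p) for p in [("bag", "candle"), ("sign", "bag")]}

# Candidate floorplans: each one groups the four categories into two pairs.
floorplans = {
    "boxes+bags / signs+candles": [("box", "bag"), ("sign", "candle")],
    "boxes+candles / signs+bags": [("box", "candle"), ("sign", "bag")],
    "boxes+signs / candles+bags": [("box", "sign"), ("candle", "bag")],
}

# Keep only layouts that never place a dissociated pair together.
acceptable = [
    name for name, groups in floorplans.items()
    if not any(frozenset(g) in dissociated for g in groups)
]
print(acceptable)
```

Only the layout that pairs boxes with bags and signs with candles avoids both dissociated pairs.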

Let's practice!
