What is market basket analysis?

Market Basket Analysis in Python

Isaiah Hull

Visiting Associate Professor of Finance, BI Norwegian Business School

Selecting a bookstore layout

 

This image shows a bookstore layout where the fiction and biography sections are grouped, and the poetry and history sections are grouped.

 

This image shows a bookstore layout where the fiction and poetry sections are grouped, and the biography and history sections are grouped.


Exploring transaction data

TID Transaction
1 biography, history
2 fiction
3 biography, poetry
4 fiction, history
5 biography
... ...
75000 fiction, poetry

 

  • TID = unique ID associated with each transaction.
  • Transaction = set of unique items purchased together.
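
Because a transaction is defined as a set of unique items, Python's built-in `set` is a natural way to model it. The sketch below is only an illustration: the rows are copied from the sample table, and the dictionary name `baskets` is hypothetical.

```python
# Rows copied from the sample table above, keyed by TID.
# The name `baskets` is hypothetical and used only for this illustration.
baskets = {
    1: {"biography", "history"},
    2: {"fiction"},
    3: {"biography", "poetry"},
}

# Sets ignore item order, so these two baskets compare equal.
same_basket = baskets[1] == {"history", "biography"}

# Sets also drop duplicates, so each item appears at most once.
deduplicated = {"fiction", "fiction"}

print(same_basket)
print(len(deduplicated))
```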

What is market basket analysis?

  1. Identify products frequently purchased together.

    • Biography and history
    • Fiction and poetry
  2. Construct recommendations based on these findings.

    • Place biography and history sections together.
    • Keep fiction and history apart.

The use cases of market basket analysis

  1. Build a Netflix-style recommendation engine.
  2. Improve product recommendations on an e-commerce store.
  3. Cross-sell products in a retail setting.
  4. Improve inventory management.
  5. Upsell products.

Using market basket analysis

TID Transaction
11 fiction, biography
12 fiction, biography
13 history, biography
... ...
19 fiction, biography
20 fiction, biography
... ...
  • Market basket analysis
    • Construct association rules
    • Identify items frequently purchased together
  • Association rules
    • {antecedent} $\rightarrow$ {consequent}
      • {fiction} $\rightarrow$ {biography}
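
To make the {antecedent} → {consequent} form concrete, the sketch below lists every candidate rule with one item on each side for the four genres that appear in the bookstore example. It only enumerates possibilities and says nothing about which rules are useful; the variable names are illustrative.

```python
from itertools import permutations

# The four genres from the bookstore example.
genres = ["biography", "fiction", "history", "poetry"]

# Each ordered pair (antecedent, consequent) is one candidate rule.
# Order matters: {fiction} -> {biography} differs from {biography} -> {fiction}.
rules = list(permutations(genres, 2))

# With 4 items there are 4 * 3 = 12 one-to-one rules.
print(len(rules))

for antecedent, consequent in rules[:3]:
    print(f"{{{antecedent}}} -> {{{consequent}}}")
```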

Loading the data

import pandas as pd

# Load transactions with pandas.
books = pd.read_csv("datasets/bookstore.csv")

# Print the first two rows.
print(books.head(2))
TID         Transaction                 
0    biography, history
1               fiction

For a refresher, see the Pandas Cheat Sheet.


Building transactions

 

# Split transaction strings into lists, stripping whitespace around each item.
transactions = books['Transaction'].apply(lambda t: [item.strip() for item in t.split(',')])

# Convert the Series of lists into a list of lists.
transactions = list(transactions)
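
If the transaction strings contain a space after each comma, as they do in the sample table, a bare comma split keeps that space as part of the item name. The sketch below uses a single string taken from the table to show the difference; it assumes nothing about the rest of the file.

```python
# One transaction string as it appears in the sample table.
raw = "biography, history"

# A bare comma split keeps the space that follows the comma.
naive = raw.split(",")

# Stripping each piece yields clean item names.
clean = [item.strip() for item in raw.split(",")]

print(naive)
print(clean)
```

Clean item names matter later, because `' history'` and `'history'` would otherwise be treated as two different items.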

Counting the itemsets

# Print the first transaction.
print(transactions[0])
['biography', 'history']
# Count the transactions that consist of exactly biography and fiction, in that order.
transactions.count(['biography', 'fiction'])
218
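
Note that `list.count` matches only transactions equal to the given list, with the same items in the same order. The sketch below contrasts this with a subset test that counts every transaction containing both items. The sample data is hypothetical and is not drawn from the bookstore file.

```python
# Hypothetical sample, not the bookstore data.
transactions = [
    ["biography", "fiction"],
    ["fiction", "biography"],
    ["biography", "fiction", "history"],
    ["fiction"],
]

# list.count matches only transactions equal to this list, in this order.
exact = transactions.count(["biography", "fiction"])

# A subset test counts any transaction that contains both items,
# regardless of order or of additional items.
target = {"biography", "fiction"}
containing = sum(target.issubset(t) for t in transactions)

print(exact, containing)
```

Which count is appropriate depends on how the transactions were prepared; if every list is sorted and holds at most two items, the two approaches agree for two-item baskets.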

Making a recommendation

# Count the transactions that consist of exactly fiction and poetry, in that order.
transactions.count(['fiction', 'poetry'])
5357

This image shows a bookstore layout where the fiction and poetry sections are grouped, and the biography and history sections are grouped.


Let's practice!
