“If this then that” with the apriori algorithm

Market Basket Analysis in R

Christopher Bruffaerts

Statistician

Recap of extracted rules (1)

TID Transaction
1 {Bread, Butter, Cheese, Wine}
2 {Bread, Butter, Wine}
3 {Bread, Butter}
4 {Butter, Cheese, Wine}
5 {Butter, Cheese}
6 {Cheese, Wine}
7 {Butter, Wine}
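The slides do not show how the transaction object is built. A minimal sketch, assuming the `arules` package is installed and that the seven baskets above are entered by hand, stored here under the name `trans`, which the later calls use:

```r
library(arules)

# The seven example baskets from the table above, one character vector per TID.
baskets <- list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
)

# Coerce the list into the sparse "transactions" class that apriori() expects.
trans <- as(baskets, "transactions")

# Print the baskets to check that they match the table.
inspect(trans)
```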

Apply apriori on transactions:

library(arules)

# Keep rules found in at least 3 of the 7 baskets with confidence >= 0.6
rules = apriori(trans,
                parameter = list(
                  supp = 3/7, conf = 0.6,
                  minlen = 2),
                control = list(verbose = FALSE)
)

Recap of extracted rules (2)

Create dataframe with extracted rules

df_rules = as(rules, "data.frame")
df_rules
                 rules   support confidence      lift count
1  {Bread} => {Butter} 0.4285714  1.0000000 1.1666667     3
2   {Cheese} => {Wine} 0.4285714  0.7500000 1.0500000     3
3   {Wine} => {Cheese} 0.4285714  0.6000000 1.0500000     3
4 {Cheese} => {Butter} 0.4285714  0.7500000 0.8750000     3
5   {Wine} => {Butter} 0.5714286  0.8000000 0.9333333     4
6   {Butter} => {Wine} 0.5714286  0.6666667 0.9333333     4
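The three measures in this table can be checked by hand. A minimal base R sketch, assuming the seven baskets from the recap; `support()` and `rule_metrics()` are hypothetical helper names, not part of `arules`:

```r
# The seven example baskets from the recap table (TID 1 to 7).
baskets <- list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
)

# Share of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Support, confidence and lift of the rule lhs => rhs.
rule_metrics <- function(lhs, rhs, baskets) {
  supp <- support(c(lhs, rhs), baskets)
  conf <- supp / support(lhs, baskets)   # P(rhs | lhs)
  lift <- conf / support(rhs, baskets)   # confidence relative to the RHS base rate
  c(support = supp, confidence = conf, lift = lift)
}

rule_metrics("Bread", "Butter", baskets)
rule_metrics("Wine", "Butter", baskets)
```

A lift below 1, as for {Wine} => {Butter}, means Butter appears less often in baskets with Wine than in baskets overall.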

Appearance of frequent itemsets

Frequent itemsets for Cheese and Wine

supp_cheese_wine = 
    apriori(trans, 
        parameter = list(
          target = "frequent itemsets",
          supp = 3/7),
        appearance = list(
          items = c("Cheese",  "Wine"))
)
inspect(supp_cheese_wine)
    items         support   count
[1] {Cheese}      0.5714286 4    
[2] {Wine}        0.7142857 5    
[3] {Cheese,Wine} 0.4285714 3
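The same three itemsets can probably also be obtained by mining everything first and filtering afterwards. A sketch under that assumption, using the `%oin%` ("only in") operator from `arules`; the names `all_sets` and `only_cheese_wine` are illustrative:

```r
library(arules)

# Rebuild the seven example baskets so the block runs on its own.
trans <- as(list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
), "transactions")

# Mine all frequent itemsets at the same support threshold, without an appearance filter.
all_sets <- apriori(trans,
                    parameter = list(target = "frequent itemsets", supp = 3/7),
                    control = list(verbose = FALSE))

# Keep only the itemsets made up exclusively of Cheese and/or Wine.
only_cheese_wine <- subset(all_sets, subset = items %oin% c("Cheese", "Wine"))
inspect(only_cheese_wine)
```

Filtering after the fact is convenient for exploration; the `appearance` argument avoids generating the unwanted itemsets in the first place.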

Appearance of extracted rules

Specific rules for Cheese

rules_cheese_rhs = apriori(data = trans,
                   parameter = list(supp = 3/7, conf = 0.2, minlen = 2),
                   appearance = list(rhs = "Cheese"),
                   control = list(verbose = FALSE))
inspect(rules_cheese_rhs)
    lhs         rhs      support   confidence lift  count
[1] {Wine}   => {Cheese} 0.4285714 0.6        1.050 3    
[2] {Butter} => {Cheese} 0.4285714 0.5        0.875 3
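As an alternative to the `appearance` argument, the rules for Cheese can be selected from an unrestricted rule set. A minimal sketch, assuming the same thresholds as above; `all_rules` and `cheese_rules` are illustrative names:

```r
library(arules)

# Rebuild the seven example baskets so the block runs on its own.
trans <- as(list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
), "transactions")

# Mine every rule that meets the thresholds, whatever its RHS.
all_rules <- apriori(trans,
                     parameter = list(supp = 3/7, conf = 0.2, minlen = 2),
                     control = list(verbose = FALSE))

# Keep the rules whose right-hand side is Cheese.
cheese_rules <- subset(all_rules, subset = rhs %in% "Cheese")
inspect(cheese_rules)
```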

Redundant rules

What is a redundant rule?

A rule is redundant if a more general rule with the same or a higher confidence exists.

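Super-rule:

A rule is more general if it has the same RHS but one or more items removed from the LHS. The more specific rule, whose LHS contains those items plus additional ones, is a super-rule of the more general rule.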

Example:

Super-rules of {A} $\rightarrow$ {C}:

  • {A, B} $\rightarrow$ {C}
  • {A, B, D} $\rightarrow$ {C}

A rule is non-redundant if either:

  • no more general rule exists in the rule set, or
  • every more general rule has a strictly lower confidence
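The definition can be sketched in base R on the {A}, {A, B}, {A, B, D} example. The confidence values below are invented for illustration, and `is_more_general()` and `is_redundant_sketch()` are hypothetical helpers, not the `arules` implementation:

```r
# Toy rule set from the example above; confidence values are made up.
toy_rules <- list(
  list(lhs = c("A"),           rhs = "C", confidence = 0.6),
  list(lhs = c("A", "B"),      rhs = "C", confidence = 0.6),
  list(lhs = c("A", "B", "D"), rhs = "C", confidence = 0.8)
)

# TRUE if `general` has the same RHS as `specific` and a strictly smaller LHS.
is_more_general <- function(general, specific) {
  identical(general$rhs, specific$rhs) &&
    length(general$lhs) < length(specific$lhs) &&
    all(general$lhs %in% specific$lhs)
}

# A rule is redundant if some more general rule in the set
# has the same or a higher confidence.
is_redundant_sketch <- function(rule, rule_set) {
  any(vapply(rule_set, function(other) {
    is_more_general(other, rule) && other$confidence >= rule$confidence
  }, logical(1)))
}

redundant <- vapply(toy_rules, is_redundant_sketch, logical(1),
                    rule_set = toy_rules)
redundant
```

With these invented values, {A, B} => {C} adds nothing over {A} => {C} and is flagged, while {A, B, D} => {C} is kept because it beats every more general rule.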

Rule redundancy (1)

Set of generated rules

rules = apriori(trans, control = list(verbose = FALSE),
                parameter = list(supp = 0.05, conf = 0.5, minlen = 2),
                appearance = list(rhs = "Bread", default = "lhs"))

Set of pruned rules (non-redundant)

redundant_rules = is.redundant(rules)
non_redundant_rules = rules[!redundant_rules]

Rule redundancy (2)

Comparing extracted rules and non-redundant rules

inspect(rules) 
    lhs                     rhs     support   confidence lift     count
[1] {Butter}             => {Bread} 0.4285714 0.5        1.166667 3    
[2] {Butter,Wine}        => {Bread} 0.2857143 0.5        1.166667 2    
[3] {Butter,Cheese,Wine} => {Bread} 0.1428571 0.5        1.166667 1  
inspect(non_redundant_rules)
    lhs         rhs     support   confidence lift     count
[1] {Butter} => {Bread} 0.4285714 0.5        1.166667 3

Let's follow the rules!
