Market Basket Analysis in R
Christopher Bruffaerts
Statistician
| TID | Transaction |
|---|---|
| 1 | {Bread, Butter, Cheese, Wine} |
| 2 | {Bread, Butter, Wine} |
| 3 | {Bread, Butter} |
| 4 | {Butter, Cheese, Wine} |
| 5 | {Butter, Cheese} |
| 6 | {Cheese, Wine} |
| 7 | {Butter, Wine} |
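The code that follows uses a transactions object called `data_trx` (and later `trans`) without showing how it is created. A minimal sketch of one way to build it from the table above, assuming the arules package is installed; the intermediate name `trx_list` is made up for illustration:

```r
library(arules)

# Hypothetical helper list: one character vector of items per transaction (TID 1 to 7)
trx_list <- list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
)

# Coerce the list to the arules "transactions" class
data_trx <- as(trx_list, "transactions")

# Assumption: the later slides use the name `trans` for the same data
trans <- data_trx
```

Both names then point to the same seven-transaction dataset, so the calls to `apriori()` in the rest of the section can be run as written.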
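Apply apriori to the transactions:
rules = apriori(data_trx,
  parameter = list(
    supp = 3/7, conf = 0.6,   # minimum support and minimum confidence
    minlen = 2),              # at least two items per rule, so the LHS is never empty
  control = list(verbose = FALSE)  # suppress progress output
)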
Build a data frame from the extracted rules
df_rules = as(rules, "data.frame")
df_rules
                 rules   support confidence      lift count
1  {Bread} => {Butter} 0.4285714  1.0000000 1.1666667     3
2   {Cheese} => {Wine} 0.4285714  0.7500000 1.0500000     3
3   {Wine} => {Cheese} 0.4285714  0.6000000 1.0500000     3
4 {Cheese} => {Butter} 0.4285714  0.7500000 0.8750000     3
5   {Wine} => {Butter} 0.5714286  0.8000000 0.9333333     4
6   {Butter} => {Wine} 0.5714286  0.6666667 0.9333333     4
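To see where these numbers come from, the metrics of a single rule can be recomputed by hand. The sketch below uses base R only and checks row 2, {Cheese} => {Wine}; the names `trx`, `support`, `lhs` and `rhs` are introduced here for illustration and are not part of arules.

```r
# The seven transactions from the table, as a plain list (base R only)
trx <- list(
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Bread", "Butter", "Wine"),
  c("Bread", "Butter"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Cheese", "Wine"),
  c("Butter", "Wine")
)

# Fraction of transactions that contain every item in `items`
support <- function(items) {
  mean(sapply(trx, function(basket) all(items %in% basket)))
}

lhs <- "Cheese"
rhs <- "Wine"

supp_rule <- support(c(lhs, rhs))      # 3 of 7 transactions contain both items
conf_rule <- supp_rule / support(lhs)  # (3/7) / (4/7) = 0.75
lift_rule <- conf_rule / support(rhs)  # 0.75 / (5/7) = 1.05

round(c(support = supp_rule, confidence = conf_rule, lift = lift_rule), 7)
```

The three values agree with the support, confidence and lift reported for {Cheese} => {Wine}. A lift only slightly above 1 means that Cheese buyers are barely more likely than average to buy Wine.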
Frequent itemsets for Cheese and Wine
supp_cheese_wine =
  apriori(trans,
    parameter = list(
      target = "frequent itemsets",  # mine itemsets instead of rules
      supp = 3/7),
    appearance = list(
      items = c("Cheese", "Wine"))   # restrict the search to these two items
  )
inspect(supp_cheese_wine)
    items         support   count
[1] {Cheese}      0.5714286 4
[2] {Wine}        0.7142857 5
[3] {Cheese,Wine} 0.4285714 3
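Rules specific to Cheese
rules_cheese_rhs = apriori(data = trans,
  parameter = list(supp = 3/7, conf = 0.2, minlen = 2),
  # Cheese is the only allowed RHS; every other item may appear on the LHS only
  appearance = list(rhs = "Cheese", default = "lhs"),
  control = list(verbose = FALSE))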
inspect(rules_cheese_rhs)
    lhs         rhs      support   confidence lift  count
[1] {Wine}   => {Cheese} 0.4285714 0.6        1.050 3
[2] {Butter} => {Cheese} 0.4285714 0.5        0.875 3
What is a redundant rule?
A rule is redundant if a more general rule exists with the same or a higher confidence.
Super-rule:
A rule is more general if it has the same RHS but one or more items removed from the LHS.
Example:
Super-rules of {A} $\rightarrow$ {C}:
Non-redundant rules are defined as:
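Stated formally, the definition of redundancy given above can be written as follows. This is a sketch: the notation $\mathrm{conf}$ for confidence and the symbols $X$, $X'$, $Y$ are assumptions introduced here, not taken from the slides.

```latex
\[
X \rightarrow Y \ \text{is redundant}
\iff
\exists\, X' \subset X :\ \mathrm{conf}(X' \rightarrow Y) \ge \mathrm{conf}(X \rightarrow Y)
\]
\[
X \rightarrow Y \ \text{is non-redundant}
\iff
\forall\, X' \subset X :\ \mathrm{conf}(X' \rightarrow Y) < \mathrm{conf}(X \rightarrow Y)
\]
```

Here $X' \subset X$ is a proper subset of the LHS, so $X' \rightarrow Y$ is a super-rule of $X \rightarrow Y$. In the toy data, {Butter} $\rightarrow$ {Bread} is a super-rule of {Butter, Wine} $\rightarrow$ {Bread}.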
Set of extracted rules
rules = apriori(trans, control = list(verbose = FALSE),
  parameter = list(supp = 0.05, conf = 0.5, minlen = 2),
  # Bread is the only allowed RHS; all other items go on the LHS
  appearance = list(rhs = "Bread", default = "lhs"))
Set of pruned (non-redundant) rules
redundant_rules = is.redundant(rules)         # logical vector, TRUE for redundant rules
non_redundant_rules = rules[!redundant_rules] # keep only the non-redundant ones
Comparing extracted and non-redundant rules
inspect(rules)
    lhs                     rhs     support   confidence lift     count
[1] {Butter}             => {Bread} 0.4285714 0.5        1.166667 3
[2] {Butter,Wine}        => {Bread} 0.2857143 0.5        1.166667 2
[3] {Butter,Cheese,Wine} => {Bread} 0.1428571 0.5        1.166667 1
inspect(non_redundant_rules)
    lhs         rhs     support   confidence lift     count
[1] {Butter} => {Bread} 0.4285714 0.5        1.166667 3
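The pruning can be reproduced by hand to see why rules [2] and [3] are dropped. The following base R sketch is a simplified illustration and not the arules implementation: it compares each rule only against the other rules in the extracted set, and the names `lhs_list`, `confidence`, `is_proper_subset` and `is_redundant_manual` are hypothetical.

```r
# The three extracted rules, copied from the output above (the RHS is always Bread)
lhs_list <- list(
  c("Butter"),
  c("Butter", "Wine"),
  c("Butter", "Cheese", "Wine")
)
confidence <- c(0.5, 0.5, 0.5)

# TRUE if every item of `a` is in `b` and `a` has fewer items than `b`
is_proper_subset <- function(a, b) all(a %in% b) && length(a) < length(b)

# Flag a rule when another rule in the set has a strictly smaller LHS
# and a confidence that is at least as high
is_redundant_manual <- sapply(seq_along(lhs_list), function(i) {
  any(sapply(seq_along(lhs_list), function(j) {
    j != i &&
      is_proper_subset(lhs_list[[j]], lhs_list[[i]]) &&
      confidence[j] >= confidence[i]
  }))
})

is_redundant_manual
```

Rules [2] and [3] are flagged because {Butter} => {Bread} reaches the same confidence of 0.5 with fewer items on the LHS, which leaves the single rule shown by `inspect(non_redundant_rules)`.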