Market Basket Analysis in R
Christopher Bruffaerts
Statistician
Market basket analysis
Focus on the what, not the how much;
i.e., what is in the customer's basket?

Key metrics
A word of caution
The set of extracted rules can be very large.
Do not review or print all of the rules; always work with a subset or with head/tail.
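The caution above can be sketched as follows. This is a minimal illustration only: the support and confidence thresholds are assumptions chosen for the example, not values taken from the slides.

```r
library(arules)
data(Groceries)

# Thresholds below are illustrative assumptions, not values from the slides
rules <- apriori(
  Groceries,
  parameter = list(support = 0.001, confidence = 0.5),
  control   = list(verbose = FALSE)
)

length(rules)  # check how many rules were extracted before printing anything

# Look only at the five rules with the highest lift
top_rules <- head(sort(rules, by = "lift"), n = 5)
inspect(top_rules)

# Or restrict the rule set to rules that satisfy a condition
milk_rules <- subset(rules, subset = rhs %in% "whole milk" & lift > 2)
inspect(head(milk_rules, n = 5))
```

Checking length(rules) first gives a sense of scale before deciding how aggressively to filter.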
Back to the Grocery Store

Dataset from the arules package
# Loading the arules package
library(arules)
# Loading the Groceries dataset
data(Groceries)
summary(Groceries)
transactions as itemMatrix in sparse format with
9835 rows (elements/itemsets/transactions) and
169 columns (items) and a density of 0.02609146
most frequent items:
      whole milk other vegetables       rolls/buns             soda           yogurt          (Other)
            2513             1903             1809             1715             1372            34055
element (itemset/transaction) length distribution:
sizes
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17
2159 1643 1299 1005  855  645  545  438  350  246  182  117   78   77   55   46   29
  18   19   20   21   22   23   24   26   27   28   29   32
  14   14    9   11    4    6    1    1    1    1    3    1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   2.000   3.000   4.409   6.000  32.000
includes extended item information - examples:
       labels  level2           level1
1 frankfurter sausage meat and sausage
2     sausage sausage meat and sausage
3  liver loaf sausage meat and sausage
# Plotting a sample of 200 transactions
image(sample(Groceries, 200))
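The density and mean basket size reported by summary(Groceries) can be reproduced by hand from the other figures in the same output. The sketch below uses only numbers copied from that output; the variable names are illustrative.

```r
# Figures copied from the summary(Groceries) output above
n_transactions <- 9835
n_items        <- 169
top_counts     <- c(2513, 1903, 1809, 1715, 1372)  # five most frequent items
other_count    <- 34055                            # all remaining items

total_item_occurrences <- sum(top_counts) + other_count

# Density: share of non-empty cells in the transactions-by-items matrix
density <- total_item_occurrences / (n_transactions * n_items)

# Mean basket size: item occurrences per transaction
mean_basket <- total_item_occurrences / n_transactions

print(round(density, 8))
print(round(mean_basket, 3))
```

Both results should agree with the density of 0.02609146 and the mean length of 4.409 shown in the summary.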

Most popular items
itemFrequencyPlot(Groceries, type = "relative",
                  topN = 10, horiz = TRUE, col = "steelblue3")

Least popular items
par(mar=c(2,10,2,2), mfrow=c(1,1))
barplot(sort(table(unlist(LIST(Groceries))))[1:10],
horiz = TRUE,las = 1,col='orange')
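An alternative sketch for the same plot uses itemFrequency() from arules instead of unlisting every transaction. The variable names are illustrative.

```r
library(arules)
data(Groceries)

# Absolute item counts as a named numeric vector, sorted in ascending order
item_counts <- sort(itemFrequency(Groceries, type = "absolute"))
least10     <- head(item_counts, 10)

# Wide left margin so the item labels fit
par(mar = c(2, 10, 2, 2))
barplot(least10, horiz = TRUE, las = 1, col = "orange")
```

One difference to keep in mind: itemFrequency() reports every item in the item matrix, including any with a count of zero, whereas table() only sees items that actually occur.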

Contingency table
# Contingency table
tbl <- crossTable(Groceries)
tbl[1:4, 1:4]
            frankfurter sausage liver loaf ham
frankfurter         580      99          7  25
sausage              99     924         10  49
liver loaf            7      10         50   3
ham                  25      49          3 256
Sorted contingency table
# Sorted contingency table
tbl <- crossTable(Groceries, sort = TRUE)
tbl[1:4, 1:4]
                 whole milk other vegetables rolls/buns soda
whole milk             2513              736        557  394
other vegetables        736             1903        419  322
rolls/buns              557              419       1809  377
soda                    394              322        377 1715
Contingency table
# Counts
tbl['whole milk','flour']
[1] 83
# Chi-square test
crossTable(Groceries, measure='chi')['whole milk', 'flour']
[1] 0.003595389
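The slides do not say how crossTable() defines its 'chi' measure. For a conventional test of independence between two items, one option is to build the 2x2 table from the co-occurrence counts and pass it to base R's chisq.test(). This is a sketch of that cross-check, not a reproduction of what crossTable() computes internally.

```r
library(arules)
data(Groceries)

counts <- crossTable(Groceries)  # co-occurrence counts; diagonal holds item counts
n      <- length(Groceries)      # number of transactions

both  <- counts["whole milk", "flour"]
milk  <- counts["whole milk", "whole milk"]
flour <- counts["flour", "flour"]

# 2x2 table: rows = whole milk (yes/no), columns = flour (yes/no)
tab <- matrix(
  c(both,         milk - both,
    flour - both, n - milk - flour + both),
  nrow = 2, byrow = TRUE,
  dimnames = list(whole_milk = c("yes", "no"), flour = c("yes", "no"))
)

# Pearson chi-square test without continuity correction
test <- chisq.test(tab, correct = FALSE)
print(tab)
print(test)
```

A small p-value here would suggest that whole milk and flour do not appear in baskets independently of each other.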
Contingency tables with other metrics
crossTable(Groceries, measure = 'lift', sort = TRUE)[1:4, 1:4]
                 whole milk other vegetables rolls/buns      soda
whole milk               NA        1.5136341   1.205032 0.8991124
other vegetables  1.5136341               NA   1.197047 0.9703476
rolls/buns        1.2050318        1.1970465         NA 1.1951242
soda              0.8991124        0.9703476   1.195124        NA
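Lift for a pair of items is the joint support divided by the product of the two item supports, so it is symmetric: the value for (A, B) must equal the value for (B, A). The sketch below recomputes it in base R from the co-occurrence counts in the sorted contingency table and the 9835 transactions reported by summary(Groceries); the variable names are illustrative.

```r
# Co-occurrence counts copied from the sorted contingency table
items  <- c("whole milk", "other vegetables", "rolls/buns", "soda")
counts <- matrix(
  c(2513,  736,  557,  394,
     736, 1903,  419,  322,
     557,  419, 1809,  377,
     394,  322,  377, 1715),
  nrow = 4, byrow = TRUE, dimnames = list(items, items)
)
n <- 9835  # number of transactions, from summary(Groceries)

support      <- counts / n                         # joint supports; diagonal = item supports
item_support <- diag(support)
expected     <- outer(item_support, item_support)  # joint support under independence

lift <- support / expected
diag(lift) <- NA  # an item paired with itself is not informative

print(round(lift, 4))
```

Values above 1 indicate that the two items co-occur more often than independence would predict, and values below 1 indicate less often, as for whole milk and soda.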
MovieLens: A web-based recommender system that suggests movies for users to watch.
