Challenges of network-based inference

Predictive Analytics using Networked Data in R

María Óskarsdóttir, Ph.D.

Post-doctoral researcher

First challenge

Splitting the data!

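library(igraph)
# g is assumed to be an existing igraph object with at least 10 vertices
set.seed(1001)
# Randomly assign 6 of the 10 vertices to one part; the remaining 4 form the other
sampleVertices <- sample(1:10, 6, replace = FALSE)
# Each induced subgraph keeps only the edges whose two endpoints are both in its part
plot(induced_subgraph(g, V(g)[sampleVertices]))
plot(induced_subgraph(g, V(g)[-sampleVertices]))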

Splitting
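To see why splitting is a challenge, it may help to count how many edges survive a random split. The sketch below uses base R only and a made-up 10-node edge list (`edges` is hypothetical, not the course's graph `g`); the names `inTrain`, `inTest`, and `lost` are also illustrative.

```r
# Hypothetical 10-node network stored as an edge list (not the course's graph g)
edges <- data.frame(
  from = c(1, 1, 2, 3, 4, 5, 6, 7, 8, 9),
  to   = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 10)
)

set.seed(1001)
sampleVertices <- sample(1:10, 6, replace = FALSE)

# An edge stays in a part only if both of its endpoints fall in that part
inTrain <- edges$from %in% sampleVertices & edges$to %in% sampleVertices
inTest  <- !(edges$from %in% sampleVertices) & !(edges$to %in% sampleVertices)

# Edges with one endpoint on each side disappear from both subgraphs
lost <- !(inTrain | inTest)

c(train = sum(inTrain), test = sum(inTest), lost = sum(lost))
```

Every edge counted under `lost` is relational information that neither subgraph can use, which is what makes a naive train/test split problematic for networked data.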


Second challenge

The observations in the dataset are not independent and identically distributed (iid)

IID
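One rough way to illustrate the lack of independence is to compare how often connected nodes share a label with how often they would if labels were assigned independently. The sketch below uses a made-up 6-node network and made-up churn labels (both are assumptions for illustration only).

```r
# Hypothetical edge list and churn labels for nodes 1..6 (1 = churn, 0 = non-churn)
edges <- data.frame(from = c(1, 1, 2, 3, 4, 5),
                    to   = c(2, 3, 3, 4, 5, 6))
churn <- c(1, 1, 1, 0, 0, 0)

# Share of edges whose two endpoints carry the same label
sameLabel <- churn[edges$from] == churn[edges$to]
observed  <- mean(sameLabel)

# Share expected if the labels of connected nodes were independent
p        <- mean(churn)
expected <- p^2 + (1 - p)^2

c(observed = observed, expected = expected)
```

When the observed share is clearly higher than the expected one, the labels of connected nodes are correlated, so the observations cannot be treated as iid.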


Third challenge

Collective Inference!
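Collective inference means that the prediction for one node depends on the predictions for its neighbors, and vice versa, so the unknown nodes have to be inferred together. A minimal sketch, assuming a simple iterative scheme in which each unknown node repeatedly takes the average churn probability of its neighbors (the 4-node path network, the starting values, and the number of iterations are all assumptions):

```r
# Hypothetical path network 1 - 2 - 3 - 4 as an adjacency matrix
A <- matrix(0, 4, 4)
A[cbind(c(1, 2, 3), c(2, 3, 4))] <- 1
A <- A + t(A)

# Node 1 is a known churner, node 4 a known non-churner; nodes 2 and 3 are unknown
p       <- c(1, 0.5, 0.5, 0)
unknown <- c(2, 3)

# Repeatedly replace each unknown probability by the mean over its neighbors
for (i in 1:50) {
  pNew       <- as.vector(A %*% p) / rowSums(A)
  p[unknown] <- pNew[unknown]
}

round(p, 3)
```

Nodes 2 and 3 influence each other in every pass, so neither can be scored in isolation; the loop lets their estimates settle jointly.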



Probabilistic relational neighbor classifier

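# probability churn (C): mean of the neighbors' churn probabilities
(0.9 + 0.2 + 0.1 + 0.4 + 0.8) / 5
# [1] 0.48
# probability non-churn (NC): mean of the neighbors' non-churn probabilities
(0.1 + 0.8 + 0.9 + 0.6 + 0.2) / 5
# [1] 0.52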
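The same calculation can be wrapped in a small helper. This is only a sketch: the function name `prnc` and its optional `weights` argument are hypothetical, and equal weights reproduce the plain averages above.

```r
# Hypothetical helper: average the neighbors' churn probabilities,
# optionally weighting each neighbor (equal weights by default)
prnc <- function(neighborChurnProb,
                 weights = rep(1, length(neighborChurnProb))) {
  pChurn <- weighted.mean(neighborChurnProb, weights)
  c(C = pChurn, NC = 1 - pChurn)
}

# Churn probabilities of the five neighbors from the example above
neighborChurnProb <- c(0.9, 0.2, 0.1, 0.4, 0.8)
result <- prnc(neighborChurnProb)
result
```

Because the non-churn probability of each neighbor is one minus its churn probability, the two class probabilities always sum to 1.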

Let's practice!

