Social network based inference

Fraud Detection in R

Tim Verdonck

Professor of Data Science at KU Leuven

Social network based inference

Goal

Predict the behavior of a node based on the behavior of other nodes

missing_nodes_in_network.png

Social network based inference

Challenges

  • Data are not independent
    • Behavior of one node might influence behavior of other nodes
    • Correlated behavior between nodes
  • Collective inference: inferences about nodes can affect each other

missing_nodes_in_network.png

Non-relational vs relational

Non-relational model

  • Only uses local information
  • Logistic regression, decision trees, ...

logistic_regression.png

Relational model

  • Makes use of links in the network
  • Relational neighbor classifier

simple_network.png

Relational neighbor classifier

Assumptions

  • Homophily: connected nodes have a propensity to belong to the same class ("guilt by association")
  • Some class labels are known

missing_node.png

Relational neighbor classifier

Probability of fraud

$$P(F | ?) = \frac{1 + 1}{1 + 1 + 1 + 1 + 1}=\frac{2}{5}= 40\%$$
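The count above can be written in general form. This is a sketch in notation that does not appear on the slides: $n$ is the unlabeled node, $N_n$ its set of neighbors, and $y_j$ the known label of neighbor $j$. The estimated probability of fraud is the fraction of neighbors labeled as fraudulent:

$$P(F \mid n) = \frac{1}{|N_n|} \sum_{j \in N_n} \mathbb{1}[y_j = F]$$

With five neighbors, two of them fraudulent, this gives $2/5$, matching the slide.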

missing_node.png

Relational neighbor classifier with weights

Probability of fraud

$$P(F | ?) = \frac{1 + 2}{3 + 1 + 1 + 2 + 1}=\frac{3}{8}=37.5\%$$
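Both calculations can be reproduced in a few lines of base R. This is a minimal sketch, not the course's own code: the data frame `neighbors_of_unknown` and its column names are hypothetical, and the assignment of weights to individual neighbors is an assumption chosen to match the numerators and denominators shown above.

```r
# Hypothetical table of the neighbors of node "?": one row per neighbor.
# The weight-to-neighbor assignment is assumed, not taken from the slides.
neighbors_of_unknown <- data.frame(
  node    = c("B", "C", "D", "E", "A"),
  isFraud = c(1, 0, 1, 0, 0),   # 1 = fraud, 0 = not fraud
  weight  = c(2, 3, 1, 1, 1)    # strength of the link to "?"
)

# Unweighted: share of neighbors that are fraudulent.
prob_unweighted <- mean(neighbors_of_unknown$isFraud)

# Weighted: share of the total link weight that leads to fraudulent neighbors.
prob_weighted <- weighted.mean(neighbors_of_unknown$isFraud,
                               w = neighbors_of_unknown$weight)

prob_unweighted
prob_weighted
```

Setting every weight to 1 makes the weighted estimate collapse to the unweighted one, which is why the unweighted classifier can be seen as a special case.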

missing_node_weighted.png

Relational neighbor classifier

## Nodes are labeled as 1 (fraud), 0 (not fraud), or NA (unknown)
vertex_attr(network)
$name
"?" "B" "C" "D" "E" "A"
$isFraud
NA  1  0  1  0  0

## The edges have a weight
edge_attr(network)
$weight
2 3 1 1 1
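The slides do not show how `network` is created. One way to build a graph with these attributes is sketched below using igraph's `graph_from_data_frame()`. The edge list is an assumption: it supposes that every edge connects "?" to one other node and that the weights follow the vertex order printed above.

```r
library(igraph)

# Assumed edge list: "?" is linked to each of the five other nodes.
edges <- data.frame(
  from   = c("?", "?", "?", "?", "?"),
  to     = c("B", "C", "D", "E", "A"),
  weight = c(2, 3, 1, 1, 1)
)

# Vertex table: the first column becomes the vertex name,
# the remaining columns become vertex attributes.
nodes <- data.frame(
  name    = c("?", "B", "C", "D", "E", "A"),
  isFraud = c(NA, 1, 0, 1, 0, 0)
)

network <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

vertex_attr(network)
edge_attr(network)
```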

Relational neighbor classifier

## subgraph(): create a subgraph containing node "?" and all fraudulent nodes
subnetwork <- subgraph(network, v = c("?", "B", "D"))

## strength(): sum the weights of the edges adjacent to node "?"
prob_fraud <- strength(subnetwork, v = "?") / strength(network, v = "?")
prob_fraud
0.375
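The same calculation can be wrapped in a function so that the fraudulent nodes do not have to be listed by hand. The helper `prob_fraud_for()` below is hypothetical and not part of igraph or the course. The toy network is rebuilt with `make_star()` on the assumption that "?" is connected to every other node, with weights assigned in vertex order.

```r
library(igraph)

# Assumed star-shaped toy network: "?" is linked to every other node.
network <- make_star(6, mode = "undirected", center = 1)
V(network)$name    <- c("?", "B", "C", "D", "E", "A")
V(network)$isFraud <- c(NA, 1, 0, 1, 0, 0)
E(network)$weight  <- c(2, 3, 1, 1, 1)

# Hypothetical helper: share of a node's total link weight
# that leads to neighbors labeled as fraud.
prob_fraud_for <- function(graph, node) {
  # Row of the weighted adjacency matrix: link weight from `node`
  # to every vertex (0 where there is no edge).
  w <- as_adjacency_matrix(graph, attr = "weight", sparse = FALSE)[node, ]
  is_fraud <- V(graph)$isFraud == 1      # NA for unlabeled vertices
  # which() drops both FALSE and NA, keeping only known fraud nodes.
  sum(w[which(is_fraud)]) / sum(w)
}

prob_fraud_for(network, "?")
```

As in the slide, the denominator is the full strength of the node, so links to unlabeled neighbors would count toward the total. A node without any edges yields a division by zero, which this sketch does not handle.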

Let's practice!
