Fraud Detection in Python
Charlotte Werger
Data Scientist
Rule-based systems have their limitations:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Step 1: split your features and labels into train and test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Step 2: Define which model you want to use
model = LinearRegression()

# Step 3: Fit the model to your training data
model.fit(X_train, y_train)

# Step 4: Obtain model predictions from your test data
y_predicted = model.predict(X_test)

# Step 5: Compare y_test to predictions and obtain performance metrics
print(metrics.r2_score(y_test, y_predicted))
0.821206237313
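The number printed in step 5 is the coefficient of determination, R². As a rough illustration of what that score measures, here is a minimal pure-Python sketch assuming the standard definition 1 - SS_res / SS_tot. The function name r2_by_hand and the sample values are hypothetical and not part of the course material.

```python
def r2_by_hand(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed standard definition)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: how far the predictions are from the true values
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: how far the true values are from their own mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not from the course data)
y_test_example = [3.0, 5.0, 7.0, 9.0]
y_pred_example = [2.5, 5.5, 7.0, 9.5]

score = r2_by_hand(y_test_example, y_pred_example)
print(score)
```

A score of 1.0 means the predictions match the test labels exactly, while a model that always predicts the mean of the labels scores 0.0.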
Chapter 2. Supervised learning: train a model using existing fraud labels
Chapter 3. Unsupervised learning: use your data to determine what is 'suspicious' behavior without labels
Chapter 4. Fraud detection using text data: learn how to augment your fraud detection models with text mining and topic modeling