Anomaly Detection in Python
Bekhruz (Bex) Tuychiev
Kaggle Master, Data Science Content Creator
Hyperparameters that influence IForest the most:
contamination
n_estimators
max_samples
max_features
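How IForest classifies data points:
contamination sets the fraction of data points expected to be outliers
The contamination fraction of points with the highest anomaly scores are chosen as outlying data points

from pyod.models.iforest import IForest

# Accepts a value between 0 and 0.5
iforest = IForest(contamination=0.05)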
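
The thresholding idea behind contamination can be sketched without any library. This is a minimal illustration, not PyOD's actual code: the scores are made-up values, and `percentile` and `label_by_contamination` are hypothetical helper names. It assumes the cutoff sits at the (1 - contamination) percentile of the anomaly scores and that anything above it is labeled an outlier.

```python
def percentile(values, q):
    """Linearly interpolated percentile, with q between 0 and 100."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def label_by_contamination(scores, contamination):
    """Return (threshold, labels); label 1 marks an outlier, 0 an inlier."""
    threshold = percentile(scores, 100 * (1 - contamination))
    labels = [1 if score > threshold else 0 for score in scores]
    return threshold, labels


# Hypothetical anomaly scores for 20 data points (higher = more anomalous)
scores = [
    0.41, 0.39, 0.44, 0.40, 0.72, 0.38, 0.42, 0.43, 0.37, 0.81,
    0.40, 0.36, 0.45, 0.41, 0.39, 0.42, 0.38, 0.43, 0.40, 0.44,
]

threshold, labels = label_by_contamination(scores, contamination=0.1)
outlier_positions = [i for i, label in enumerate(labels) if label == 1]
print(threshold, outlier_positions)
```

With contamination=0.1 and 20 scores, the two highest-scoring points end up above the cutoff. A larger contamination value lowers the cutoff and flags more points.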
# More trees for larger datasets
iforest = IForest(n_estimators=1000)
iforest.fit(airbnb_df)
# Float values are fractions: each tree uses 60% of the rows and 90% of the columns
iforest = IForest(n_estimators=200, max_samples=0.6, max_features=0.9)
iforest.fit(airbnb_df)
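
To see what fractional max_samples and max_features mean in practice, the arithmetic can be sketched as below. The dataset shape (10,000 rows by 20 columns) is a made-up assumption, and `per_tree_budget` is a hypothetical helper. It assumes fractions are converted to counts by truncation, with at least one feature kept, which is how the underlying scikit-learn forest is understood to behave.

```python
def per_tree_budget(n_rows, n_cols, max_samples, max_features):
    """Rows and columns each tree draws when both settings are fractions."""
    rows_per_tree = int(max_samples * n_rows)
    cols_per_tree = max(1, int(max_features * n_cols))
    return rows_per_tree, cols_per_tree


# Hypothetical dataset shape, for illustration only
rows_per_tree, cols_per_tree = per_tree_budget(
    n_rows=10_000, n_cols=20, max_samples=0.6, max_features=0.9
)
print(rows_per_tree, cols_per_tree)
```

Smaller fractions make each tree cheaper to build and more varied, at the cost of each tree seeing less of the data.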