Anomaly Detection in Python
Bekhruz (Bex) Tuychiev
Kaggle Master, Data Science Content Creator
Hyperparameters which influence IForest the most:
contamination
n_estimators
max_samples
max_features

How IForest classifies data points:
contamination
The fraction of data points given by contamination is chosen as outlying data points.

from pyod.models.iforest import IForest

# Accepts a value between 0 and 0.5
iforest = IForest(contamination=0.05)
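To make the role of contamination concrete, here is a minimal pure-Python sketch of the idea: the highest-scoring fraction of points is flagged as outliers. The function label_outliers and the random scores are hypothetical and only illustrate the thresholding step; pyod's own implementation may differ in details such as how ties and percentiles are handled.

```python
import random


def label_outliers(scores, contamination=0.05):
    """Flag the highest-scoring fraction of points as outliers (1), the rest as inliers (0).

    Hypothetical helper for illustration only; not part of pyod.
    """
    if not 0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    n_outliers = int(round(contamination * len(scores)))
    # Indices ordered from most to least anomalous
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_outliers])
    return [1 if i in flagged else 0 for i in range(len(scores))]


random.seed(0)
scores = [random.random() for _ in range(200)]  # made-up anomaly scores
labels = label_outliers(scores, contamination=0.05)
print(sum(labels), "of", len(labels), "points flagged as outliers")
```

With 200 points and contamination=0.05, ten points are flagged. Raising contamination flags more points without changing the underlying scores.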
# More trees for larger datasets
iforest = IForest(n_estimators=1000)
iforest.fit(airbnb_df)
iforest = IForest(n_estimators=200, max_samples=0.6, max_features=0.9)
iforest.fit(airbnb_df)
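As a rough illustration of what fractional max_samples and max_features mean, the sketch below draws the row and column subset a single tree might be trained on. The helper draw_subsample and the dataset shape are hypothetical, and the rounding rule is an assumption about how fractions are converted to counts; the library's internals may differ.

```python
import random


def draw_subsample(n_rows, n_cols, max_samples=0.6, max_features=0.9, rng=None):
    """Pick the row and column indices one tree would see.

    Hypothetical helper; assumes fractions are truncated to whole counts
    and that sampling is done without replacement.
    """
    rng = rng or random.Random()
    n_sub_rows = int(max_samples * n_rows)
    n_sub_cols = max(1, int(max_features * n_cols))  # keep at least one feature
    rows = rng.sample(range(n_rows), n_sub_rows)
    cols = rng.sample(range(n_cols), n_sub_cols)
    return rows, cols


rng = random.Random(42)
# Made-up dataset shape: 1000 rows, 10 features
rows, cols = draw_subsample(1000, 10, max_samples=0.6, max_features=0.9, rng=rng)
print(len(rows), "rows and", len(cols), "features for this tree")
```

Each of the n_estimators trees would get its own draw, so lower fractions make individual trees cheaper and more varied at the cost of each tree seeing less data.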