Intermediate Predictive Analytics in Python
Nele Verbiest
Senior Data Scientist @PythonPredictions
from scipy.stats.mstats import winsorize

# Cap the lowest 5% and the highest 1% of the values
basetable["variable_winsorized"] = winsorize(
    basetable["variable"],
    limits=[0.05, 0.01])
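To see what the limits do, here is a minimal sketch on toy data. The `demo` DataFrame and its values are hypothetical and stand in for `basetable`; only the `winsorize` call and its `limits` come from the slide.

```python
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize

# Hypothetical toy data: the integers 1..100 in a column named "variable"
demo = pd.DataFrame({"variable": np.arange(1, 101)})

# limits=[0.05, 0.01]: cap the lowest 5% and the highest 1% of the values
winsorized = winsorize(demo["variable"].to_numpy(), limits=[0.05, 0.01])

# winsorize returns a masked array; convert it to a plain array before storing
demo["variable_winsorized"] = np.asarray(winsorized)

# Compare the range before and after winsorizing
print(demo["variable"].agg(["min", "max"]))
print(demo["variable_winsorized"].agg(["min", "max"]))
```

The extreme values are replaced rather than dropped, so the column keeps all 100 rows while its minimum rises and its maximum falls.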
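import pandas as pd

# Cap age at three standard deviations from the mean
mean_age = basetable["age"].mean()
sd_age = basetable["age"].std()
lower_limit = mean_age - 3 * sd_age
upper_limit = mean_age + 3 * sd_age
# Pass the original index so the capped values align with basetable's rows
basetable["age_no_outliers"] = pd.Series(
    [min(max(a, lower_limit), upper_limit)
     for a in basetable["age"]],
    index=basetable.index
)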
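The same capping can probably be written more compactly with pandas' `Series.clip`, which is not used on the slide and is shown here only as an alternative. This is a sketch on hypothetical toy data; the `demo` DataFrame, its index, and the `looped` list are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data with a non-default index and one extreme age
ages = [38, 39, 40, 41, 42] * 4 + [200]
demo = pd.DataFrame({"age": ages}, index=range(100, 121))

# Limits at three standard deviations from the mean, as on the slide
mean_age = demo["age"].mean()
sd_age = demo["age"].std()
lower_limit = mean_age - 3 * sd_age
upper_limit = mean_age + 3 * sd_age

# clip caps values outside [lower_limit, upper_limit] and keeps the index
demo["age_no_outliers"] = demo["age"].clip(lower=lower_limit, upper=upper_limit)

# The same capping written as an explicit loop, for comparison
looped = [min(max(a, lower_limit), upper_limit) for a in demo["age"]]

print(demo.tail(3))
```

Because `clip` returns a Series with the same index as its input, the result lines up with the original rows even when the index does not start at 0.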