Feature Engineering for Machine Learning in Python
Robert O'Callaghan
Director of Data Science, Ordergroove
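import pandas as pd
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training data only (train is an already-loaded DataFrame)
scaler = StandardScaler()
scaler.fit(train[['col']])
train['scaled_col'] = scaler.transform(train[['col']])
# FIT SOME MODEL
# ....
# Reuse the scaler fitted on train to transform the test data
test = pd.read_csv('test_csv')
test['scaled_col'] = scaler.transform(test[['col']])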
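# Compute the outlier thresholds from the training data only
train_mean = train['col'].mean()
train_std = train['col'].std()
cut_off = train_std * 3
train_lower = train_mean - cut_off
train_upper = train_mean + cut_off
# Subset train data
train = train[(train['col'] < train_upper) &
              (train['col'] > train_lower)]
test = pd.read_csv('test_csv')
# Subset test data using the thresholds learned from train
test = test[(test['col'] < train_upper) &
            (test['col'] > train_lower)]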
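The outlier steps above can be tried end to end with a minimal sketch like the one below. The helper names `outlier_bounds` and `apply_bounds` and the synthetic data are assumptions made up for illustration; they are not part of the course material. The point it shows is that the thresholds are computed once, from the training data, and then reused unchanged on the test data.

```python
import numpy as np
import pandas as pd


def outlier_bounds(series, n_std=3.0):
    """Return (lower, upper) as the mean minus/plus n_std standard deviations."""
    mean = series.mean()
    cut_off = series.std() * n_std
    return mean - cut_off, mean + cut_off


def apply_bounds(df, column, lower, upper):
    """Keep only the rows whose value lies strictly between the bounds."""
    return df[(df[column] > lower) & (df[column] < upper)]


# Synthetic data, invented for this sketch: 200 typical values plus one extreme value
rng = np.random.default_rng(0)
train = pd.DataFrame({'col': np.append(rng.normal(50, 5, 200), 500.0)})
test = pd.DataFrame({'col': [48.0, 52.0, 400.0]})

# Thresholds come from train only
lower, upper = outlier_bounds(train['col'])

# The same thresholds are applied to both sets
train_clean = apply_bounds(train, 'col', lower, upper)
test_clean = apply_bounds(test, 'col', lower, upper)
```

Using single brackets (`train['col']`) gives a Series, so the comparisons yield a row-wise boolean mask and rows are actually dropped rather than replaced with missing values.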
Data leakage: Using data that you will not have access to when assessing the performance of your model
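As a rough illustration of the definition above, the sketch below contrasts a scaler fitted on all rows (leaky, because test rows influence the statistics) with one fitted on the training rows only. The synthetic data, the 80/20 split, and the names `leaky_scaler` and `safe_scaler` are assumptions for this example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic data, invented for this sketch
rng = np.random.default_rng(42)
data = pd.DataFrame({'col': rng.normal(10, 2, 100)})
train = data.iloc[:80].copy()
test = data.iloc[80:].copy()

# Leaky: the mean and standard deviation are computed from train AND test rows
leaky_scaler = StandardScaler().fit(data[['col']])

# Safe: the statistics come from the training rows only
safe_scaler = StandardScaler().fit(train[['col']])

# The train-fitted scaler is then reused, unchanged, on the test rows
train['scaled_col'] = safe_scaler.transform(train[['col']]).ravel()
test['scaled_col'] = safe_scaler.transform(test[['col']]).ravel()
```

With the safe scaler, the scaled training column has mean zero by construction, while the scaled test column generally does not, which is expected: the test data is treated as unseen.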