Scaling data for machine learning

Analyzing IoT Data in Python

Matthias Voppichler

IT Developer

Evaluate the model

from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression()
logreg.fit(X_train, y_train)

print(logreg.score(X_test, y_test))
0.78145113

Scaling

scikit-learn's StandardScaler
  • removes the mean
  • scales to unit variance

Image illustrating scaling: after scaling, the features are centered on 0
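
The two steps listed above can be reproduced by hand. The following is a minimal sketch on made-up numbers (the array `X` is hypothetical, not the course data); it assumes the population standard deviation (`ddof=0`), which is what StandardScaler uses.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 samples, 2 features on very different scales
X = np.array([[80.0, 1013.0],
              [70.0, 1015.0],
              [60.0, 1020.0],
              [90.0, 1012.0]])

# Step 1: remove the mean of each feature (column)
centered = X - X.mean(axis=0)

# Step 2: divide by the standard deviation so each feature has unit variance
# (np.std defaults to ddof=0, the same convention StandardScaler uses)
manual = centered / X.std(axis=0)

# Compare the manual result with StandardScaler
sk = StandardScaler().fit_transform(X)
print(np.allclose(manual, sk))  # True
```

After both steps, every column has mean 0 and variance 1, regardless of its original unit.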

Unscaled data

print(data.head())
                     humidity  temperature  pressure
timestamp                                           
2018-10-01 00:00:00      81.0         11.8    1013.4
2018-10-01 00:15:00      79.7         11.9    1013.1
2018-10-01 00:30:00      81.0         12.1    1013.0
2018-10-01 00:45:00      79.7         11.7    1012.7
2018-10-01 01:00:00      84.3         11.2    1012.6

StandardScaler

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
sc.fit(data)
print(sc.mean_)
print(sc.var_)
[  71.8826716    14.17002019 1018.17042396]
[372.78261022  20.37926608  53.67519188]
data_scaled = sc.transform(data)
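
As a sanity check, the transform can be applied by hand to one row. This sketch copies the mean and variance printed above and the first row of the unscaled data; the variable names `mean`, `var`, `row`, and `scaled_row` are only illustrative.

```python
import numpy as np

# Values printed on the slide
mean = np.array([71.8826716, 14.17002019, 1018.17042396])   # sc.mean_
var = np.array([372.78261022, 20.37926608, 53.67519188])    # sc.var_

# First row of the unscaled data: humidity, temperature, pressure
row = np.array([81.0, 11.8, 1013.4])

# What transform does per feature: subtract the mean,
# then divide by the standard deviation (square root of the variance)
scaled_row = (row - mean) / np.sqrt(var)
print(scaled_row.round(4))
```

Up to rounding, the result agrees with the first row of the scaled data the deck prints for 2018-10-01 00:00:00.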

StandardScaler

import pandas as pd

# transform returns a NumPy array; wrap it to restore column names and index
df_scaled = pd.DataFrame(data_scaled,
                         columns=data.columns,
                         index=data.index)
print(df_scaled.head())
                     humidity  temperature  pressure
timestamp                                           
2018-10-01 00:00:00  0.472215    -0.524998 -0.651134
2018-10-01 00:15:00  0.404884    -0.502847 -0.692082
2018-10-01 00:30:00  0.472215    -0.458543 -0.705731
2018-10-01 00:45:00  0.404884    -0.547150 -0.746679
2018-10-01 01:00:00  0.643132    -0.657908 -0.760329

Evaluate the model

logreg = LogisticRegression()
logreg.fit(X_train_scaled, y_train)

print(logreg.score(X_test_scaled, y_test))
0.88145113
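
The deck does not show how `X_train_scaled` and `X_test_scaled` are built. One common pattern is sketched below on synthetic data: `make_classification` stands in for the course dataset, and the sample count and `random_state` values are assumptions. The scaler is fitted on the training split only and then reused for the test split.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the course data (assumption, not the IoT dataset)
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits
sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)
X_test_scaled = sc.transform(X_test)

# Only the features are scaled; the class labels stay as they are
logreg = LogisticRegression()
logreg.fit(X_train_scaled, y_train)
print(logreg.score(X_test_scaled, y_test))
```

Fitting on the training split alone keeps test-set statistics out of the model, so the reported score reflects unseen data.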
Let's practice!
