Prepare data for machine learning

Analyzing IoT Data in Python

Matthias Voppichler

IT Developer

Machine Learning Refresher

  • Supervised learning
    • Classification
    • Regression
  • Unsupervised learning
    • Cluster analysis
  • Deep learning
    • Neural networks

Labels

print(environment_labeled.head())
                     humidity  temperature  pressure   label
timestamp                                                   
2018-10-01 00:00:00      81.0         11.8    1013.4       1
2018-10-01 00:15:00      79.7         11.9    1013.1       1
2018-10-01 00:30:00      81.0         12.1    1013.0       1
2018-10-01 00:45:00      79.7         11.7    1012.7       1
2018-10-01 01:00:00      84.3         11.2    1012.6       1

Train / Test split

Splitting time series data

  • Model should not see test data during training
  • Cannot use a random split: it would mix later observations into the training set
  • Model should not be allowed to look into the future
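
The points above can be sketched as a small chronological split. This is a minimal illustration on synthetic data, not the course's own code: the `readings` frame, its values, and the `chronological_split` helper are assumptions made up for this example.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute readings covering 8 days; values are made up for illustration.
index = pd.date_range("2018-10-01", periods=8 * 96, freq="15min")
rng = np.random.default_rng(0)
readings = pd.DataFrame({"temperature": rng.normal(12.0, 1.0, len(index))}, index=index)


def chronological_split(df, first_test_timestamp):
    """Hypothetical helper: every training row precedes every test row."""
    cutoff = pd.Timestamp(first_test_timestamp)
    train = df[df.index < cutoff]    # strictly before the cutoff
    test = df[df.index >= cutoff]    # the cutoff and everything after it
    return train, test


train_part, test_part = chronological_split(readings, "2018-10-07")

# The last training timestamp lies before the first test timestamp,
# so the model cannot look into the future.
print(train_part.index.max(), "<", test_part.index.min())
```

Unlike a random split, the boundary here is a single point in time, so no observation later than the cutoff can end up in the training set.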

Train / test split

split_day = "2018-10-13"

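# Label-based slicing on a DatetimeIndex is inclusive: train holds all of split_day
train = environment.loc[:split_day]
# Start the test set strictly after the last training timestamp to avoid overlap
test = environment[environment.index > train.index[-1]]

print(train.iloc[0].name)
print(train.iloc[-1].name)
print(test.iloc[0].name)
print(test.iloc[-1].name)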
2018-10-01 00:00:00
2018-10-13 23:45:00
2018-10-14 00:00:00
2018-10-15 23:45:00

[Figure: train / test split]


Features and Labels

X_train = train.drop("target", axis=1)
y_train = train["target"]
X_test = test.drop("target", axis=1)
y_test = test["target"]

print(X_train.shape)
print(y_train.shape)
(1248, 3)
(1248,)
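
To make the shapes above concrete, here is a small self-contained sketch of the same feature/label separation. The `demo` frame and the `split_features_target` helper are hypothetical, introduced only for illustration; the values are made up.

```python
import pandas as pd


def split_features_target(df, target_column="target"):
    """Hypothetical helper: return (X, y) for a frame with one target column."""
    X = df.drop(target_column, axis=1)  # every column except the target
    y = df[target_column]               # the target column as a Series
    return X, y


# Tiny made-up frame with three feature columns and a binary target.
demo = pd.DataFrame({
    "humidity": [81.0, 79.7, 84.3, 60.2],
    "temperature": [11.8, 11.9, 11.2, 19.5],
    "pressure": [1013.4, 1013.1, 1012.6, 1009.8],
    "target": [1, 1, 1, 0],
})

X_demo, y_demo = split_features_target(demo)
print(X_demo.shape)  # two-dimensional: (rows, feature columns)
print(y_demo.shape)  # one-dimensional: (rows,)
```

The features stay a two-dimensional DataFrame while the labels become a one-dimensional Series, which is the layout scikit-learn estimators expect for `fit(X, y)`.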

Logistic Regression

from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression()
logreg.fit(X_train, y_train)
print(logreg.predict(X_test))
[0 0 1 1 1 1 1 0 0]
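
As a rough end-to-end sketch, the steps from this lesson (chronological split, feature/label separation, fitting, predicting) can be run on synthetic data. Everything here is an assumption for illustration: the `data` frame, the rule used to generate `target`, the 300-row split point, and the `max_iter=1000` setting are not taken from the course. `score` reports mean accuracy on the held-out rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic 15-minute readings; values and the labeling rule are made up.
rng = np.random.default_rng(42)
index = pd.date_range("2018-10-01", periods=400, freq="15min")
data = pd.DataFrame(
    {
        "temperature": rng.normal(12.0, 3.0, len(index)),
        "humidity": rng.normal(75.0, 10.0, len(index)),
    },
    index=index,
)
data["target"] = (data["temperature"] > 12.0).astype(int)

# Chronological split: the first 300 rows train the model, the rest test it.
train_demo, test_demo = data.iloc[:300], data.iloc[300:]
X_tr, y_tr = train_demo.drop("target", axis=1), train_demo["target"]
X_te, y_te = test_demo.drop("target", axis=1), test_demo["target"]

# max_iter is raised above the default so the solver has room to converge
# on unscaled features.
model = LogisticRegression(max_iter=1000)
model.fit(X_tr, y_tr)

predictions = model.predict(X_te)
accuracy = model.score(X_te, y_te)  # mean accuracy on the held-out test rows
print(predictions[:10])
print(f"accuracy: {accuracy:.2f}")
```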

Let's practice!
