Understand the data

Analyzing IoT Data in Python

Matthias Voppichler

IT Developer

Store data to disk

Reasons to store IoT Data

  • Limited historical data availability
  • Reproducible results
  • Training ML Models

Store data using pandas

df_env.to_json("data.json", orient="records")
!cat data.json

[{"timestamp":1536924000000,"value":22.3},{"timestamp":1536924600000,"value":22.8},{"timestamp":1536925200000,"value":23.3},{"timestamp":1536925800000,"value":23.6},{"timestamp":1536926400000,"value":23.5}]
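
A minimal sketch of storing the same data as both JSON and CSV. The readings reuse the first values shown above; the temporary directory and the variable names json_path, csv_path, records and csv_header are assumptions made for illustration only.

```python
import json
import os
import tempfile

import pandas as pd

# Illustrative readings; timestamps are epoch milliseconds, as in the output above
df_env = pd.DataFrame({
    "timestamp": [1536924000000, 1536924600000, 1536925200000],
    "value": [22.3, 22.8, 23.3],
})

# Write into a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as out_dir:
    json_path = os.path.join(out_dir, "data.json")
    csv_path = os.path.join(out_dir, "data.csv")

    # orient="records" writes a list with one JSON object per row
    df_env.to_json(json_path, orient="records")
    # index=False keeps the row index out of the CSV file
    df_env.to_csv(csv_path, index=False)

    # Read both files back as plain text/JSON to inspect what was stored
    with open(json_path) as f:
        records = json.load(f)
    with open(csv_path) as f:
        csv_header = f.readline().strip()

print(records[0])
print(csv_header)
```

In a real project the files would be written to a permanent location rather than a temporary directory.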

Reading stored data

  • From JSON files
import pandas as pd
df_env = pd.read_json("data.json")
  • From CSV files
import pandas as pd
df_env = pd.read_csv("data.csv")
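
CSV files carry no type information, so a timestamp column is read as plain text unless pandas is told otherwise. The sketch below assumes the file has a column named timestamp and uses an in-memory string (csv_text, a hypothetical stand-in for data.csv) so it runs without a file on disk.

```python
import io

import pandas as pd

# Stand-in for the contents of data.csv (hypothetical sample rows)
csv_text = """timestamp,temperature
2018-09-01 00:00:00,16.1
2018-09-01 00:10:00,16.1
"""

# parse_dates converts the named column to datetime while reading
df_env = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

print(df_env.dtypes)
```

With a real file, the same parse_dates argument can be passed alongside the file path.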

Validate data load

  • Correct column headers
  • Check data formats
df_env.head()
            timestamp  humidity  pressure  sunshine  temperature
0 2018-09-01 00:00:00      95.6    1016.3     599.2         16.1
2 2018-09-01 00:10:00      95.5    1016.4     600.0         16.1
4 2018-09-01 00:20:00      95.2    1016.5     598.9         16.1
6 2018-09-01 00:30:00      95.1    1016.4     600.0         16.1
8 2018-09-01 00:40:00      95.3    1016.3     600.0         16.1
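
The two checks above can also be done programmatically instead of by eye. This is a rough sketch: the set expected_columns and the two sample rows are assumptions based on the head() output shown here, not part of the original material.

```python
import pandas as pd

# Column names we expect, taken from the head() output above
expected_columns = {"timestamp", "humidity", "pressure", "sunshine", "temperature"}

# Two sample rows standing in for the loaded data
df_env = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-09-01 00:00:00", "2018-09-01 00:10:00"]),
    "humidity": [95.6, 95.5],
    "pressure": [1016.3, 1016.4],
    "sunshine": [599.2, 600.0],
    "temperature": [16.1, 16.1],
})

# Correct column headers: nothing expected should be missing
missing_columns = expected_columns - set(df_env.columns)

# Data formats: timestamp should be a datetime, the measurements numeric
time_ok = pd.api.types.is_datetime64_any_dtype(df_env["timestamp"])
numeric_ok = all(
    pd.api.types.is_numeric_dtype(df_env[col])
    for col in expected_columns - {"timestamp"}
)

print(missing_columns, time_ok, numeric_ok)
```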

DataFrame.info()

df_env.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13085 entries, 0 to 13084
Data columns (total 5 columns):
pressure                13085 non-null float64
humidity                13085 non-null float64
sunshine                13083 non-null float64
temperature             13059 non-null float64
timestamp               13085 non-null datetime64[ns]

dtypes: datetime64[ns](1), float64(4)
memory usage: 1.4 MB
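
The non-null counts in info() hint at missing values: a column with fewer non-null entries than the index has gaps. One way to get those gaps as numbers is isna().sum(), sketched here on a small made-up frame (the values and the name missing_per_column are illustrative assumptions).

```python
import pandas as pd

nan = float("nan")

# Small made-up frame with a few missing readings
df_env = pd.DataFrame({
    "sunshine": [599.2, nan, 598.9, 600.0],
    "temperature": [16.1, 16.1, nan, nan],
})

# isna() marks missing cells, sum() counts them per column
missing_per_column = df_env.isna().sum()

print(missing_per_column)
```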

pandas describe()

df_env.describe()
           humidity      pressure      sunshine  temperature
count  13057.000000  13057.000000  13057.000000  13057.00000
mean      73.748350   1019.173003    187.794746     14.06647
std       20.233558      6.708031    274.094951      6.61272
min        8.900000    989.500000      0.000000     -1.80000
25%       57.500000   1016.000000      0.000000      9.80000
50%       78.800000   1019.700000      0.000000     13.40000
75%       91.300000   1023.300000    598.900000     18.90000
max      100.100000   1039.800000    600.000000     30.40000
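
The min and max rows of describe() are a quick way to spot implausible sensor readings, such as a relative humidity above 100. The sketch below is one possible check; the bounds dictionary holds assumed plausibility limits chosen for illustration, and the sample values are made up.

```python
import pandas as pd

# Made-up sample values; 100.1 mimics the humidity maximum shown above
df_env = pd.DataFrame({
    "humidity": [95.6, 100.1, 57.5],
    "temperature": [16.1, -1.8, 30.4],
})

stats = df_env.describe()

# Assumed plausible ranges per column (not from the original material)
bounds = {"humidity": (0.0, 100.0), "temperature": (-40.0, 60.0)}

# Collect columns whose observed min or max falls outside its range
out_of_range = [
    col
    for col, (low, high) in bounds.items()
    if stats.loc["min", col] < low or stats.loc["max", col] > high
]

print(out_of_range)
```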

Time for Practice!
