Visualizing time-series imputations

Dealing with Missing Data in Python

Suraj Donthi

Deep Learning & Computer Vision Learning

Air quality time-series plot

airquality['Ozone'].plot(title='Ozone', marker='o', figsize=(30, 5))

Time series plot of air quality dataset

Ffill Imputation

ffill_imp['Ozone'].plot(color='red', marker='o', linestyle='dotted', figsize=(30, 5))
airquality['Ozone'].plot(title='Ozone', marker='o')

Time series plot of the forward-filled air quality dataset

Bfill Imputation

bfill_imp['Ozone'].plot(color='red', marker='o', linestyle='dotted', figsize=(30, 5))
airquality['Ozone'].plot(title='Ozone', marker='o')

Time series plot of the backward-filled air quality dataset
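The slides plot ffill_imp and bfill_imp without showing how they are built. A minimal sketch of one way to produce them, assuming pandas' DataFrame.ffill() and DataFrame.bfill(); the `toy` frame and its dates are made-up stand-ins for the air quality data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the air quality data: six daily readings with gaps.
dates = pd.date_range("2021-01-01", periods=6, freq="D")
toy = pd.DataFrame(
    {"Ozone": [np.nan, 20.0, np.nan, np.nan, 50.0, np.nan]}, index=dates
)

# Forward fill carries the last observed value forward into each gap.
ffill_imp = toy.ffill()

# Backward fill pulls the next observed value back into each gap.
bfill_imp = toy.bfill()

# Side-by-side view of the original and both imputations.
comparison = pd.DataFrame(
    {
        "original": toy["Ozone"],
        "ffill": ffill_imp["Ozone"],
        "bfill": bfill_imp["Ozone"],
    }
)
print(comparison)
```

In this sketch the leading gap survives the forward fill and the trailing gap survives the backward fill, because neither has an observed value on the side it fills from.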

Linear Interpolation

linear_interp['Ozone'].plot(color='red', marker='o', linestyle='dotted', figsize=(30, 5))
airquality['Ozone'].plot(title='Ozone', marker='o')

Time series plot of the linearly interpolated air quality dataset

Quadratic Interpolation

quadratic_interp['Ozone'].plot(color='red', marker='o', linestyle='dotted', figsize=(30, 5))
airquality['Ozone'].plot(title='Ozone', marker='o')

Time series plot of the quadratically interpolated air quality dataset

Nearest Interpolation

nearest_interp['Ozone'].plot(color='red', marker='o', linestyle='dotted', figsize=(30, 5))
airquality['Ozone'].plot(title='Ozone', marker='o')

Time series plot of the nearest-neighbor interpolated air quality dataset
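To make the difference between the three interpolations concrete, here is a small sketch using pandas' DataFrame.interpolate(). The `toy` data is made up (values taken from y = x squared on a plain integer index, not the air quality series), and the 'quadratic' and 'nearest' methods assume SciPy is installed, since pandas delegates them to it:

```python
import numpy as np
import pandas as pd

# Hypothetical data: y = x**2 sampled at x = 0..5, with x = 2 and x = 3 missing.
toy = pd.DataFrame({"Ozone": [0.0, 1.0, np.nan, np.nan, 16.0, 25.0]})

# Straight line between the neighbours of the gap (1 and 16).
linear_interp = toy.interpolate(method="linear")

# Quadratic spline through the observed points (requires SciPy).
quadratic_interp = toy.interpolate(method="quadratic")

# Copy of whichever observed neighbour is closest (requires SciPy).
nearest_interp = toy.interpolate(method="nearest")

print(
    pd.DataFrame(
        {
            "linear": linear_interp["Ozone"],
            "quadratic": quadratic_interp["Ozone"],
            "nearest": nearest_interp["Ozone"],
        }
    )
)
```

On this toy curve the quadratic method recovers the missing 4 and 9, linear gives 6 and 11, and nearest repeats 1 and 16. Real ozone readings will not follow a parabola, so which method looks best there is something the plots have to show.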

A comparison of the interpolations

# Import matplotlib for the subplot grid
import matplotlib.pyplot as plt

# Create subplots: one row per interpolation
fig, axes = plt.subplots(3, 1, figsize=(30, 20))

# Create interpolations dictionary (plot title -> imputed DataFrame)
interpolations = {'Linear Interpolation': linear_interp,
                  'Quadratic Interpolation': quadratic_interp,
                  'Nearest Interpolation': nearest_interp}

# Visualize each interpolation: imputed series in dotted red, original on top
for ax, df_key in zip(axes, interpolations):
    interpolations[df_key].Ozone.plot(color='red', marker='o',
                                      linestyle='dotted', ax=ax)
    airquality.Ozone.plot(title=df_key + ' - Ozone', marker='o', ax=ax)

Comparison of the interpolated air quality dataframes

A comparison of imputation techniques

Comparison of all the imputed air quality datasets
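The same subplot loop can compare fill-based and interpolation-based imputations in one figure. The sketch below is an assumption about how such a comparison could be drawn, not the code behind the slide's figure; the `toy` frame, its dates, and the choice of three methods are illustrative:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the air quality data: eight daily readings with gaps.
dates = pd.date_range("2021-01-01", periods=8, freq="D")
toy = pd.DataFrame(
    {"Ozone": [30.0, np.nan, np.nan, 45.0, 40.0, np.nan, 20.0, 25.0]},
    index=dates,
)

# Plot title -> imputed DataFrame (pandas-only methods, no SciPy needed).
imputations = {
    "Ffill Imputation": toy.ffill(),
    "Bfill Imputation": toy.bfill(),
    "Linear Interpolation": toy.interpolate(method="linear"),
}

# One row per imputation, sharing the time axis.
fig, axes = plt.subplots(len(imputations), 1, figsize=(12, 9), sharex=True)

for ax, (name, imputed) in zip(axes, imputations.items()):
    # Imputed series in dotted red, original series drawn on top.
    imputed["Ozone"].plot(color="red", marker="o", linestyle="dotted", ax=ax)
    toy["Ozone"].plot(title=name + " - Ozone", marker="o", ax=ax)

fig.tight_layout()
```

Because the original series is drawn over the imputed one, the red dotted segments stay visible only where values were filled in, which makes the imputed points easy to spot in each panel.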

Summary

  • Time-series plot of imputed DataFrame
  • Comparison of imputations

Let's practice!
