Handling missingness

Case Study: Analyzing City Time Series Data in R

Lore Dirick

Manager of Data Science Curriculum at Flatiron School

Missingness

citydata
              pop
1980-01-01 562994
1981-01-01 564179
1982-01-01 565361
1983-01-01 565491
1984-01-01 566723
1985-01-01     NA
1986-01-01     NA
1987-01-01     NA
1988-01-01 570867
1989-01-01 572222
1990-01-01 574823

time series data plotted with missing data
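
Before filling anything, it can help to count the gaps and see where they fall. The sketch below rebuilds `citydata` by hand from the values printed above (this reconstruction is an assumption; the course object may have been created differently) and locates the missing rows.

```r
library(xts)

# Hypothetical reconstruction of citydata from the printed values
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# Logical vector marking the rows where pop is missing
na_rows <- is.na(coredata(citydata)[, "pop"])

# How many observations are missing, and on which dates
n_missing <- sum(na_rows)
missing_dates <- index(citydata)[na_rows]
```

Here the three missing years are consecutive, which is the situation where the choice of fill method affects the shape of the series most.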

Fill NAs with last observation

  • Last observation carried forward (LOCF)
citydata_locf <- na.locf(citydata)

plot.xts(citydata)
plot.xts(citydata_locf)

time series data plotted with missing data filled with last observation carried forward
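
To check what LOCF actually wrote into the gap, one option is to inspect the filled rows directly. This is a minimal sketch on a hand-built copy of `citydata` (an assumption based on the printed values); `filled_locf` is a name introduced here for illustration.

```r
library(xts)  # attaches zoo, which provides na.locf()

# Hypothetical reconstruction of citydata from the printed values
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# Carry the last observed value (1984) forward into 1985-1987
citydata_locf <- na.locf(citydata)

# Rows 6 to 8 are the years that were missing
filled_locf <- as.numeric(coredata(citydata_locf)[6:8, "pop"])
```

All three filled years repeat the 1984 value, so the plot shows a flat step followed by a jump in 1988.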

Fill NAs with next observation

  • Next observation carried backward (NOCB)
citydata_nocb <- na.locf(citydata, fromLast = TRUE)

plot.xts(citydata)
plot.xts(citydata_nocb)

time series data plotted with missing data filled with next observation carried backward
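
The same check works for NOCB. The sketch below again uses a hand-built copy of `citydata` (an assumption based on the printed values) and a hypothetical helper name, `filled_nocb`.

```r
library(xts)  # attaches zoo, which provides na.locf()

# Hypothetical reconstruction of citydata from the printed values
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# fromLast = TRUE runs the fill from the end of the series,
# so the next observed value (1988) is carried backward
citydata_nocb <- na.locf(citydata, fromLast = TRUE)

# Rows 6 to 8 are the years that were missing
filled_nocb <- as.numeric(coredata(citydata_nocb)[6:8, "pop"])
```

Here the jump moves to the start of the gap: 1985 through 1987 all take the 1988 value.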

Linear interpolation

citydata_approx <- na.approx(citydata)

plot.xts(citydata)
plot.xts(citydata_approx)

time series data plotted with missing data filled with linear interpolation
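
A quick way to see how interpolation differs from the two carry methods is to look at the filled values. The sketch below uses a hand-built copy of `citydata` (an assumption based on the printed values). As far as I know, `na.approx()` interpolates against the time index by default, so the yearly steps may differ slightly because 1984 is a leap year.

```r
library(xts)  # attaches zoo, which provides na.approx()

# Hypothetical reconstruction of citydata from the printed values
dates <- seq(as.Date("1980-01-01"), by = "year", length.out = 11)
pop <- c(562994, 564179, 565361, 565491, 566723,
         NA, NA, NA, 570867, 572222, 574823)
citydata <- xts(cbind(pop = pop), order.by = dates)

# Draw a straight line between the 1984 and 1988 observations
citydata_approx <- na.approx(citydata)

# Rows 6 to 8 are the years that were missing
filled_approx <- as.numeric(coredata(citydata_approx)[6:8, "pop"])
```

Unlike LOCF and NOCB, the filled values rise gradually from the 1984 level toward the 1988 level instead of forming a flat step.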

Let's practice!
