Filter missing values

Intermediate SQL with AI

Jasmin Ludolf

Senior Data Science Content Developer

Missing values

SELECT title, release_year
FROM films
ORDER BY release_year DESC, title ASC;
|title           |release_year|
|----------------|------------|
|10,000 B.C.     |            |
|A Touch of Frost|            |    
|Anger Management|            |    
|Animal Kingdom  |            |
...
Intermediate SQL with AI

Missing values

SELECT COUNT(*) AS total_count
FROM films;
|total_count|
|-----------|
|4968       |
  • Counts all records
SELECT COUNT(language) AS language_count
FROM films;
|total_count|
|-----------|
|4957       |
  • Counts non-missing values
Intermediate SQL with AI

Missing values

$$

  • Break calculations
  • Skew averages
  • Hide patterns

$$

✅ Identify, select, exclude missing values

❌ Fixing or replacing missing values

A broken document

Intermediate SQL with AI

Missing values: NULLs

Causes:

  • Skipping fields
  • Systems failing
  • Values not recorded

$$

Missing values:

  • NULL: true missing value
  • " " or "None"

Missing puzzle piece

Intermediate SQL with AI

IS NULL

Prompt: Show film titles and languages with no language

SELECT title, language
FROM films
WHERE language IS NULL;
|title               |language|
|--------------------|--------|
|Intolerance: Love...|        |
|Over the Hill to ...|        |
|The Big Parade      |        |
...
Intermediate SQL with AI

IS NOT NULL

Prompt: Show film titles and languages that have a language defined

SELECT title, language
FROM films
WHERE language IS NOT NULL;
|title              |language|
|-------------------|--------|
|Metropolis         |German  |
|Pandora's Box      |German  |
|The Broadway Melody|English |
|Quest for Fire     |None    |
...
Intermediate SQL with AI

Identifying other unknowns

Question mark

$$

  • IS NULL and IS NOT NULL only catch true missing values

$$

  • What about "Not available", "None"?
    • Not true missing NULL values
    • But still missing or unknown
Intermediate SQL with AI

Identifying other unknowns

Prompt: Show film titles and languages where the language is 'None'

SELECT title, language
FROM films
WHERE language = 'None';
|title              |language|
|-------------------|--------|
|Quest for Fire     |None    |
|Samsara            |None    |
Intermediate SQL with AI

Finding variations

Prompt: Show all unique language values

SELECT DISTINCT language
FROM films;
|language|
|--------|
|        |
|Arabic  |
...
|None    |
...
Intermediate SQL with AI

Filtering variations

Prompt: Show film titles and languages where the language is not null or None

SELECT title, language
FROM films
WHERE language IS NOT NULL AND language != 'None';
|title              |language|
|-------------------|--------|
|Metropolis         |German  |
|Pandora's Box      |German  |
|The Broadway Melody|English |
...
Intermediate SQL with AI

Working with zero

$$

  • Zero:
    • Missing score?
    • 0 score?

$$

  • Context required:
    • Documentation
    • Business guidelines

Zero

Intermediate SQL with AI

Handling missing values

  • Important analysis skill
  • Can impact conclusions

$$

  • Spotting missing values:
    • Explore fields to identify missing values
    • Filter as needed
    • Aggregate functions ignore NULL but include zero
    • COUNT(*) includes all rows, including NULL

Magnifying glass

Intermediate SQL with AI

Ready for practice?

Intermediate SQL with AI

Preparing Video For Download...