Filtering grouped data

Intermediate SQL with AI

Jasmin Ludolf

Senior Data Science Content Developer

WHERE or HAVING

Which languages have more than 20 films?

  • Language count: aggregated and grouped

 

WHERE

  • Individual rows
  • Which individual films meet this condition?

$$

$$

$$

HAVING

  • Grouped results
  • Which groups meet this condition?
Intermediate SQL with AI

Filtering grouped data

Prompt: Which languages have more than 20 films?

SELECT language, COUNT(*) AS lang_count
FROM films
GROUP BY language
HAVING COUNT(*) > 20;
|language|
|--------|
|Mandarin|
|Spanish |
|French  |
|Hindi   |
|English |

$$

  • Filters after:
    • Grouping
    • Aggregations
    • When counts are available
Intermediate SQL with AI

Prompting for group filters

Prompt: Film titles and budgets with a budget above 5 million

SELECT title, budget
FROM films
WHERE budget > 5000000;
|title          |budget |
|---------------|-------|
|Metropolis     |6000000|
|Duel in the Sun|8000000|
|Quo Vadis      |7623000|
|West Side Story|6000000|
...

Prompt: Languages and budget with rounded average budget above 5 million

SELECT language, ROUND(AVG(budget)) AS avg_budget
FROM films
GROUP BY language
HAVING ROUND(AVG(budget)) > 5000000;
|language|avg_budget|
|--------|----------|
|Danish  |16700000  |
|None    |8250000   |
|Tamil   |150000000 |
...
Intermediate SQL with AI

Combining filters

Prompt: Show languages with over 5 films from 2000 onwards, where average duration exceeds 80

Intermediate SQL with AI

Combining filters

Prompt: Show languages with over 5 films from 2000 onwards, where average duration exceeds 80

SELECT language, COUNT(*) AS film_count, ROUND(AVG(duration)) AS average_duration
FROM films

WHERE release_year >= 2000
GROUP BY language
Intermediate SQL with AI

Combining filters

Prompt: Show languages with over 5 films from 2000 onwards, where average duration exceeds 80

SELECT language, COUNT(*) AS film_count, ROUND(AVG(duration)) AS average_duration
FROM films
WHERE release_year >= 2000
GROUP BY language

Intermediate SQL with AI

Combining filters

Prompt: Show languages with over 5 films from 2000 onwards, where average duration exceeds 80

SELECT language, COUNT(*) AS film_count, ROUND(AVG(duration)) AS average_duration
FROM films
WHERE release_year >= 2000
GROUP BY language

HAVING COUNT(*) > 5 AND ROUND(AVG(duration)) > 80;
|language|film_count|average_duration|
|--------|----------|----------------|
|Mandarin|26        |112             |
|Japanese|10        |114             |
|Spanish |37        |107             |
...
Intermediate SQL with AI

Order of operations

$$

FROM

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHERE

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHEREGROUP BY (and aggregation) ➝

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHEREGROUP BY (and aggregation) ➝ HAVING

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHEREGROUP BY (and aggregation) ➝ HAVINGSELECT

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHEREGROUP BY (and aggregation) ➝ HAVINGSELECTORDER BY

An assembly line

Intermediate SQL with AI

Order of operations

$$

FROMWHEREGROUP BY (and aggregation) ➝ HAVINGSELECTORDER BYLIMIT

An assembly line

Intermediate SQL with AI

Filter prompts

Both: Films from 2000 onwards where the average duration is above 80

WHERE: Individual records

  • Film titles released after 2024
  • Budgets above 5 million

HAVING: Mentions aggregate

  • Languages with average budget of over 10000
  • Count of films greater than 5

Pen on paper

Intermediate SQL with AI

Iterative prompting

$$

1) Start small

2) Check the results

3) Prompt again to add more detail

4) Repeat!

3D illustration of a speech bubbles, one with AI and one with ellipsis

Intermediate SQL with AI

Time to practice!

Intermediate SQL with AI

Preparing Video For Download...