Grouping data

Intermediate SQL with AI

Jasmin Ludolf

Senior Data Science Content Developer

Grouping data

$$

  • Insights for particular groups

$$

  • Combine rows with same values

$$

  • Create summaries to reveal patterns

Collection of language books

Intermediate SQL with AI

Distinct vs. groups

SELECT DISTINCT language
FROM films;
|language|        
|--------|
|Danish  |
|Greek   |
|Dzongkha|
|...     |
  • Unique values
  • What languages do we have?
SELECT language
FROM films
GROUP BY language;
|language|        
|--------|
|Danish  |
|Greek   |
|Dzongkha|
|...     |
  • Creates groups
  • How many films do we have in each language?
Intermediate SQL with AI

Prompts for grouping

Prompt: Group films by language

SELECT language, COUNT(*) AS lang_count
FROM films
GROUP BY language;
|language|lang_count|
|--------|----------|
|        |11        |
|Arabic  |5         |
|Kannada |1         |
|Czech   |1         |
...

$$

  • Grouping is common with summary stats:
    • e.g. counting, averages

$$

  • No specific request -> COUNT()
    • Aggregate function
    • Summarizes multiple rows into a single value
Intermediate SQL with AI

Common grouping errors

Prompt: Show all film titles grouped by language

SELECT title
FROM films
GROUP BY language;
column "films.title" must appear in the
GROUP BY clause or be used in an 
aggregate function LINE 1...

AI corrections:

  • Adding a COUNT() function
    SELECT language, COUNT(title) AS count
    FROM films
    GROUP BY language;
    
  • Selecting correct fields
    SELECT title, language
    FROM films
    GROUP BY title, language;
    
Intermediate SQL with AI

Grouping prompts

Prompt: Count films by country and language

SELECT country, language, COUNT(*) AS film_count
FROM films
GROUP BY country, language;
|country           |language|film_count|
|------------------|--------|----------|
|                  |        |1         |
|Pakistan          |Urdu    |1         |
|Mexico            |English |3         |
|Dominican Republic|Spanish |1         |
...
Intermediate SQL with AI

Grouping and sorting

Prompt: Rank languages by number of films, showing the top 3

SELECT language, COUNT(*) AS film_count
FROM films
GROUP BY language
ORDER BY film_count DESC
LIMIT 3;
|language|film_count|
|--------|----------|
|English |4635      |
|French  |72        |
|Spanish |40        |
Intermediate SQL with AI

Let's practice!

Intermediate SQL with AI

Preparing Video For Download...