CASE WHEN with aggregate functions

Data Manipulation in SQL

Mona Khalil

Data Scientist, Greenhouse Software

In CASE you need to aggregate

  • CASE statements are great for
    • Categorizing data
    • Filtering data
    • Aggregating data
Data Manipulation in SQL

COUNTing CASES

  • How many home and away goals did Liverpool score in each season?
| season    | home_wins | away_wins |
|-----------|-----------|-----------|
| 2011/2012 |           |           |
| 2012/2013 |           |           |
| 2013/2014 |           |           |
| 2014/2015 |           |           |
Data Manipulation in SQL

CASE WHEN with COUNT

SELECT
    season,
    COUNT(CASE WHEN hometeam_id = 8650 
               AND home_goal > away_goal 
               THEN id END) AS home_wins
FROM match
GROUP BY season;
Data Manipulation in SQL

CASE WHEN with COUNT

SELECT
    season,
    COUNT(CASE WHEN hometeam_id = 8650 AND home_goal > away_goal 
          THEN id END) AS home_wins,
    COUNT(CASE WHEN awayteam_id = 8650 AND away_goal > home_goal 
          THEN id END) AS away_wins
FROM match
GROUP BY season;
| season    | home_wins | away_wins |
|-----------|-----------|-----------|
| 2011/2012 | 6         | 8         |
| 2012/2013 | 9         | 7         |
| 2013/2014 | 16        | 10        |
| 2014/2015 | 10        | 8         |
Data Manipulation in SQL

CASE WHEN with COUNT

SELECT
    season,
    COUNT(CASE WHEN hometeam_id = 8650 AND home_goal > away_goal 
          THEN 54321 END) AS home_wins,
    COUNT(CASE WHEN awayteam_id = 8650 AND away_goal > home_goal 
          THEN 'Some random text' END) AS away_wins
FROM match
GROUP BY season;
| season    | home_wins | away_wins |
|-----------|-----------|-----------|
| 2011/2012 | 6         | 8         |
| 2012/2013 | 9         | 7         |
| 2013/2014 | 16        | 10        |
| 2014/2015 | 10        | 8         |
Data Manipulation in SQL

CASE WHEN with SUM

SELECT
    season,
    SUM(CASE WHEN hometeam_id = 8650  
          THEN home_goal END) AS home_goals,
    SUM(CASE WHEN awayteam_id = 8650  
          THEN away_goal END) AS away_goals
FROM match
GROUP BY season;
| season    | home_goals | away_goals |
|-----------|------------|------------|
| 2011/2012 | 24         | 23         |
| 2012/2013 | 33         | 38         |
| 2013/2014 | 53         | 48         |
| 2014/2015 | 30         | 22         |
Data Manipulation in SQL

The CASE is fairly AVG...

SELECT
    season,
    AVG(CASE WHEN hometeam_id = 8650  
          THEN home_goal END) AS avg_homegoals,
    AVG(CASE WHEN awayteam_id = 8650  
          THEN away_goal END) AS avg_awaygoals
FROM match
GROUP BY season;
| season    | avg_homegoals    | avg_awaygoals    |
|-----------|------------------|------------------|
| 2011/2012 | 1.26315789473684 | 1.21052631578947 |
| 2012/2013 | 1.73684210526316 | 2                |
| 2013/2014 | 2.78947368421053 | 2.52631578947368 |
| 2014/2015 | 1.57894736842105 | 1.15789473684211 |
Data Manipulation in SQL

A ROUNDed AVG

ROUND(3.141592653589,2)
3.14
Data Manipulation in SQL

A ROUNDed AVG

SELECT 
  season,
  ROUND(AVG(CASE WHEN hometeam_id = 8650  
        THEN home_goal END),2) AS avg_homegoals,
  ROUND(AVG(CASE WHEN awayteam_id = 8650  
        THEN away_goal END),2) AS avg_homegoals
FROM match
GROUP BY season;
| season    | avg_homegoals    | avg_awaygoals    |
|-----------|------------------|------------------|
| 2011/2012 | 1.26             | 1.21             |
| 2012/2013 | 1.73             | 2                |
| 2013/2014 | 2.78             | 2.52             |
| 2014/2015 | 1.57             | 1.15             |
Data Manipulation in SQL

Percentages with CASE and AVG

SELECT
  season,
  AVG(CASE WHEN hometeam_id = 8455 AND home_goal > away_goal THEN 1
           WHEN hometeam_id = 8455 AND home_goal < away_goal THEN 0
           END) AS pct_homewins,
  AVG(CASE WHEN awayteam_id = 8455 AND away_goal > home_goal THEN 1
           WHEN awayteam_id = 8455 AND away_goal < home_goal THEN 0
           END) AS pct_awaywins
FROM match
GROUP BY season;
| season    | pct_homewins     | pct_awaywins     |
|-----------|------------------|------------------|
| 2011/2012 | 0.75             | 0.5              |
| 2012/2013 | 0.85714285714286 | 0.66666666666667 |
| 2013/2014 | 0.9375           | 0.66666666666667 |
| 2014/2015 | 1                | 0.78571428571429 |
Data Manipulation in SQL

Percentages with CASE and AVG

SELECT
  season,
  ROUND(AVG(CASE WHEN hometeam_id = 8455 AND home_goal > away_goal THEN 1
           WHEN hometeam_id = 8455 AND home_goal < away_goal THEN 0
           END),2) AS pct_homewins,
  ROUND(AVG(CASE WHEN awayteam_id = 8455 AND away_goal > home_goal THEN 1
           WHEN awayteam_id = 8455 AND away_goal < home_goal THEN 0
           END),2) AS pct_awaywins
FROM match
GROUP BY season;
| season    | pct_homewins     | pct_awaywins     |
|-----------|------------------|------------------|
| 2011/2012 | 0.75             | 0.5              |
| 2012/2013 | 0.86             | 0.67             |
| 2013/2014 | 0.94             | 0.67             |
| 2014/2015 | 1                | 0.79             |
Data Manipulation in SQL

Let's practice!

Data Manipulation in SQL

Preparing Video For Download...