Using windows for calculating statistics

Intermediate SQL Server

Ginger Grant

Instructor

Calculating the standard deviation

  • Calculate standard deviation either for the entire table or for each window
  • STDEV() calculates the standard deviation
Intermediate SQL Server

Calculating the standard deviation for the entire table

SELECT SalesPerson, SalesYear, CurrentQuota, 
       STDEV(CurrentQuota) 
       OVER () AS StandardDev, 
       ModifiedDate  AS ModDate
FROM SaleGoal
+------------+----------+------------+-----------------+----------+
|SalesPerson |SalesYear |CurrentQuota|StandardDev      | ModDate  |
+------------+----------+------------+-----------------+----------+
|Bob         |2011      |28000.00    |267841.370964233 |2011-04-16|
|Bob         |2011      |7000.00     |267841.370964233 |2011-07-17| 
|Bob         |2011      |91000.00    |267841.370964233 |2011-10-17|
|Bob         |2012      |140000.00   |267841.370964233 |2012-01-15|
|Bob         |2012      |70000.00    |267841.370964233 |2012-04-15|
...
+------------+----------+------------+-----------------+----------+
Intermediate SQL Server

Calculating the standard deviation for each partition

SELECT SalesPerson, SalesYear, CurrentQuota,  
       STDEV(CurrentQuota) 
       OVER (PARTITION BY SalesYear ORDER BY SalesYear) AS StDev, 
       ModifiedDate AS ModDate
FROM SaleGoal
+------------+----------+------------+-------------+----------+
|SalesPerson |SalesYear |CurrentQuota|StDev        | ModDate  |
+------------+----------+------------+-------------+----------+
|Bob         |2011      |28000.00    |267841.54080 |2011-04-16|
|Bob         |2011      |7000.00     |267841.54080 |2011-07-17| 
|Mary        |2011      |91000.00    |267841.54080 |2011-04-16|
|Mary        |2011      |140000.00   |267841.54080 |2011-07-15|
|Bob         |2012      |70000.00    |246538.86248 |2012-01-15|
|Bob         |2012      |154000.00   |246538.86248 |2012-04-15|
|Bob         |2012      |107000.00   |246538.86248 |2012-07-16|
...
+------------+----------+------------+------------+-----------+
Intermediate SQL Server

Calculating the mode

  • Mode is the value which appears the most often in your data
  • To calculate mode:
    • Create a CTE containing an ordered count of values using ROW_NUMBER
    • Write a query using the CTE to pick the value with the highest row number
Intermediate SQL Server

Calculating the mode in T-SQL (I)

WITH QuotaCount AS (
SELECT SalesPerson, SalesYear, CurrentQuota,  
       ROW_NUMBER() 
       OVER (PARTITION BY CurrentQuota ORDER BY CurrentQuota) AS QuotaList
FROM SaleGoal 
)
SELECT * FROM QuotaCount 
+------------+----------+------------+-------------+
|SalesPerson |SalesYear |CurrentQuota|QuotaList    | 
+------------+----------+------------+-------------+
|Bob         |2011      |7000.00     |1            |
|Bob         |2011      |28000.00    |1            | 
|Bob         |2011      |70000.00    |1            |
|Bob         |2012      |70000.00    |2            |
|Mary        |2012      |73000.00    |1            |
...
+------------+----------+------------+-------------+
  • Notice there are two values for 70,000.00
Intermediate SQL Server

Calculating the mode in T-SQL (II)

WITH QuotaCount AS (
SELECT SalesPerson, SalesYear, CurrentQuota, 
       ROW_NUMBER() 
       OVER (PARTITION BY CurrentQuota ORDER BY CurrentQuota) AS QuotaList
FROM SaleGoal 
)

SELECT CurrentQuota, QuotaList AS Mode 
FROM QuotaCount 
WHERE QuotaList IN (SELECT  MAX(QuotaList) FROM QuotaCount)
+------------+----------+
|CurrentQuota|Mode      | 
+------------+----------+
|70000.00    |2         |
+------------+----------+
Intermediate SQL Server

Let's practice!

Intermediate SQL Server

Preparing Video For Download...