Exploratory Data Analysis in SQL
Christina Maimone
Data Scientist
SELECT corr(assets, equity)
FROM fortune500;
corr
-------------------
0.637710143588615
(1 row)
Sorted values: 1 1 4 4 4 5 6 7 13 19 20 20 21 21 22
0th percentile (minimum): 1
50th percentile (median): 7
100th percentile (maximum): 22
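-- percentile is a number between 0 and 1
-- percentile_disc (discrete): returns a value that exists in the column
SELECT percentile_disc(percentile) WITHIN GROUP (ORDER BY column_name)
FROM table;
-- percentile_cont (continuous): interpolates between adjacent values when needed
SELECT percentile_cont(percentile) WITHIN GROUP (ORDER BY column_name)
FROM table;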
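As a quick check of the number-line example, the fifteen sorted values shown earlier can be fed to percentile_disc directly. This is a minimal sketch, assuming PostgreSQL; the inline table alias vals(v) and the output column names p0, p50, and p100 are hypothetical and used only for illustration. With fifteen values the median is the eighth one, so the three columns should come back as 1, 7, and 22.

```sql
-- Hypothetical inline table vals(v) holding the fifteen values from the number line
SELECT percentile_disc(0)   WITHIN GROUP (ORDER BY v) AS p0,    -- minimum
       percentile_disc(0.5) WITHIN GROUP (ORDER BY v) AS p50,   -- median
       percentile_disc(1)   WITHIN GROUP (ORDER BY v) AS p100   -- maximum
FROM (VALUES (1), (1), (4), (4), (4), (5), (6), (7),
             (13), (19), (20), (20), (21), (21), (22)) AS vals(v);
```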
SELECT val
FROM nums;
val
-----
1
3
4
5
(4 rows)
SELECT percentile_disc(.5) WITHIN GROUP (ORDER BY val),
percentile_cont(.5) WITHIN GROUP (ORDER BY val)
FROM nums;
 percentile_disc | percentile_cont
-----------------+-----------------
               3 |             3.5
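The two results differ because nums has an even number of rows, so there is no single middle value: percentile_disc returns 3, a value actually present in the column, while percentile_cont interpolates halfway between the two middle values, 3 and 4. The sketch below tries to make that visible by rebuilding nums inline and comparing percentile_cont with a hand-computed midpoint. It assumes PostgreSQL; the column aliases and the hard-coded midpoint (3 + 4) / 2.0 are illustrative and not part of the original slides.

```sql
-- Inline copy of the nums table from the slides (values 1, 3, 4, 5)
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY val) AS median_disc,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY val) AS median_cont,
       (3 + 4) / 2.0 AS manual_midpoint  -- midpoint of the two middle values, written out by hand
FROM (VALUES (1), (3), (4), (5)) AS nums(val);
```

Use percentile_disc when the answer has to be a value that occurs in the data, and percentile_cont when an interpolated value is acceptable.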