Activity of zebrafish and melatonin

Case Studies in Statistical Thinking

Justin Bois

Lecturer, Caltech

Case studies in statistical thinking

  • Hone and extend your statistical thinking skills
  • Work with real datasets
  • Review Statistical Thinking in Python (Part 1) and (Part 2)

Warming up with zebrafish

1 Movie courtesy of David Prober, Caltech

Nomenclature

  • Mutant: Has the mutation on both chromosomes

  • Wild type: Does not have the mutation

Activity of fish, day and night

1 Data courtesy of Avni Gandhi, Grigorios Oikonomou, and David Prober, Caltech

Active bouts: a metric for wakefulness

  • Active bout: A period of time where a fish is consistently active

  • Active bout length: Number of consecutive minutes with activity
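As a rough sketch of the definition above, active bout lengths can be computed from a per-minute activity trace by finding runs of consecutive active minutes. The function name `active_bout_lengths`, the example array, and the rule that any value above zero counts as "active" are assumptions made for illustration, not part of the course material.

```python
import numpy as np

def active_bout_lengths(activity):
    """Return the length (in minutes) of each run of consecutive active minutes.

    `activity` holds one value per minute. Treating any value > 0 as
    "active" is an assumption made for this sketch.
    """
    active = np.asarray(activity) > 0
    # Pad with False on both ends so every run has a clear start and end
    padded = np.concatenate(([False], active, [False]))
    # +1 marks the start of a run, -1 marks the position just past its end
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    return ends - starts

# Hypothetical activity trace, one entry per minute
activity = np.array([0, 3, 5, 0, 0, 2, 0, 1, 4, 6, 0])
bout_lengths = active_bout_lengths(activity)
print(bout_lengths)  # three bouts, lasting 2, 1, and 3 minutes
```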

Probability distributions and stories

  • Probability distribution: A mathematical description of outcomes

  • A probability distribution has a story

Distributions from Statistical Thinking I

  • Uniform
  • Binomial
  • Poisson
  • Normal
  • Exponential

The Exponential distribution

  • Poisson process: The timing of the next event is completely independent of when the previous event happened

  • Story of the Exponential distribution: The waiting time between arrivals of a Poisson process is Exponentially distributed
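The story can be checked with a small simulation. The sketch below scatters events uniformly and independently over a long time window, which approximates a Poisson process, and then looks at the gaps between successive events. The number of events, the window length, and the seed are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed illustration values (not from the slides)
n_events = 10_000
total_time = 50_000.0  # arbitrary time units

# Events placed uniformly and independently approximate a Poisson process
event_times = np.sort(rng.uniform(0, total_time, size=n_events))

# Waiting times between successive arrivals
waiting_times = np.diff(event_times)
mean_wait = waiting_times.mean()

# Exponential CDF: F(t) = 1 - exp(-t / mean).
# Evaluated at the sample median, it should be close to 0.5
# if the waiting times are Exponentially distributed.
t_median = np.median(waiting_times)
cdf_at_median = 1 - np.exp(-t_median / mean_wait)

print(mean_wait, cdf_at_median)
```

The mean waiting time should land near `total_time / n_events`, and the CDF evaluated at the median should land near one half.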

The Exponential CDF

x, y = ecdf(nuclear_incident_times)

_ = plt.plot(x, y, marker='.', linestyle='none')

1 Data source: Wheatley, Sovacool, and Sornette, Nuclear Events Database
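The slide calls `ecdf()` without defining it. A minimal sketch of such a function is given below, together with the theoretical Exponential CDF to compare against. The nuclear incident data are not reproduced here, so the array `fake_times` is synthetic stand-in data, and its scale and size are arbitrary assumptions.

```python
import numpy as np

def ecdf(data):
    """Return x and y values for the empirical CDF of a 1D data set."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Stand-in data: synthetic waiting times, NOT the nuclear incident data
rng = np.random.default_rng(0)
fake_times = rng.exponential(scale=90.0, size=250)

x, y = ecdf(fake_times)

# Theoretical Exponential CDF, with the sample mean as its only parameter
x_theor = np.linspace(0, x.max(), 200)
y_theor = 1 - np.exp(-x_theor / np.mean(fake_times))
```

The `x, y` pair can be passed to the `plt.plot()` call shown on the slide, and `x_theor, y_theor` can be drawn as a line on the same axes for comparison.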

import dc_stat_think as dcst

dcst.pearson_r?
 Signature: dcst.pearson_r(data_1, data_2)
 Docstring: Compute the Pearson correlation coefficient between two 
 samples.
 Parameters
 ----------
 data_1 : array_like
     One-dimensional array of data.
 data_2 : array_like
     One-dimensional array of data.
 Returns
 -------
 output : float
     The Pearson correlation coefficient between `data_1`
     and `data_2`.
 File:      usr/local/lib/python3.5/site-packages/
            dc_stat_think-0.1.4-py3.6.egg/dc_stat_think/dc_stat_think.py
 Type:      function
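For readers without the package installed, a function matching the docstring above could be sketched with NumPy as follows. This is an illustration of what such a function computes, not the package's actual source code.

```python
import numpy as np

def pearson_r(data_1, data_2):
    """Compute the Pearson correlation coefficient between two samples.

    Illustrative sketch only; not copied from dc_stat_think.
    """
    # np.corrcoef returns the 2x2 correlation matrix;
    # the off-diagonal entry is the correlation between the two inputs
    return np.corrcoef(data_1, data_2)[0, 1]

# Perfectly linearly related samples give a coefficient of 1 (up to rounding)
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```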

Using the dc_stat_think module

x, y = dcst.ecdf(nuclear_incident_times)

% pip install dc_stat_think

Let's practice!
