Unstacking DataFrames

Reshaping Data with pandas

Maria Eugenia Inzaugarat

Data Scientist

Review

Two DataFrames rearranged with the .stack() method

Undoing the stacking process

Arrow pointing from a stacked to an unstacked DataFrame

The .unstack() method

The reverse of the .stack() process

unstack method

The .unstack() method

Moves a level of the row index into the columns, producing a reshaped DataFrame whose column index gains a new innermost level.

Rearranged columns highlighted
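
As a rough illustration of the definition above, the sketch below unstacks a small Series and then stacks it back. The data and the level names `outer` and `inner` are made up for this example and are not from the course.

```python
import pandas as pd

# Hypothetical toy data: a Series with a two-level row index.
index = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")],
    names=["outer", "inner"],
)
stacked = pd.Series([1, 2, 3, 4], index=index)

# With no arguments, .unstack() moves the innermost row level ("inner")
# into the columns, leaving "outer" as the row index.
wide = stacked.unstack()

# Stacking the wide result moves "inner" back into the row index.
restored = wide.stack()
```

Here `wide` has one row per `outer` value and one column per `inner` value, and `restored` holds the same values as `stacked`.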

Unstack Series

churn_stacked
member  credit_card              
yes     no           credit_score        619
                     age                  43
                     country          France
                     num_products          1
                     churn               Yes
no      yes          credit_score        608
                     age                  34
                     country         Germany
                     num_products          0
                     churn                No
yes     yes          credit_score        502
                     age                  23
                     country          France
                     num_products          1
                     churn               Yes

Unstack Series

churn_stacked.unstack()
                    credit_score age  country  num_products  churn
member credit_card
    no         yes           608  34  Germany             0     No
   yes          no           619  43   France             1    Yes
               yes           502  23   France             1    Yes
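
To reproduce this example, one possible setup is sketched below. The `churn` DataFrame is an assumed reconstruction from the values printed on the slide; the course's actual source table is not shown here.

```python
import pandas as pd

# Assumed reconstruction of the slide's data (not the course's original file).
churn = pd.DataFrame(
    {
        "member": ["yes", "no", "yes"],
        "credit_card": ["no", "yes", "yes"],
        "credit_score": [619, 608, 502],
        "age": [43, 34, 23],
        "country": ["France", "Germany", "France"],
        "num_products": [1, 0, 1],
        "churn": ["Yes", "No", "Yes"],
    }
).set_index(["member", "credit_card"])

# Stacking moves the column labels into a third, innermost row level,
# which produces a Series like churn_stacked.
churn_stacked = churn.stack()

# Unstacking moves that innermost level back into the columns.
churn_wide = churn_stacked.unstack()
```

Note that the unstacked result comes back with its row index sorted, which is why the row order differs from the stacked Series.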

Unstacking a DataFrame

patients_stacked
                  year  2019 2020
   last  first feature          
   Wick   John     age    25   26 
                weight    68   72
        Julien     age    31   32
                weight    72   73
Shelley   Mary     age    41   42
                weight    68   69
         Frank     age    32   33
                weight    75   74

Unstacking a DataFrame

patients_stacked.unstack()
           year        2019       2020
        feature  age weight age weight
   last   first
Shelley   Frank   32     75  33     74
           Mary   41     68  42     69
   Wick    John   25     68  26     72
         Julien   31     72  32     73
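
A runnable version of this example might look like the sketch below. The construction of `patients_stacked` is an assumption based on the printed values, with the row levels named `last`, `first`, and `feature`.

```python
import pandas as pd

# Assumed reconstruction of the slide's patients data.
index = pd.MultiIndex.from_tuples(
    [
        ("Wick", "John", "age"), ("Wick", "John", "weight"),
        ("Wick", "Julien", "age"), ("Wick", "Julien", "weight"),
        ("Shelley", "Mary", "age"), ("Shelley", "Mary", "weight"),
        ("Shelley", "Frank", "age"), ("Shelley", "Frank", "weight"),
    ],
    names=["last", "first", "feature"],
)
patients_stacked = pd.DataFrame(
    {
        2019: [25, 68, 31, 72, 41, 68, 32, 75],
        2020: [26, 72, 32, 73, 42, 69, 33, 74],
    },
    index=index,
).rename_axis(columns="year")

# The innermost row level ("feature") becomes the innermost column level,
# underneath the existing "year" level.
patients_wide = patients_stacked.unstack()
```

The columns of `patients_wide` form a two-level index named `year` and `feature`, and each patient occupies a single row.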

Unstack a level

DataFrame with chosen level unstacked

Unstack method with level specified

Unstack level by number

churn_stacked.head(10)
member  credit_card              
yes     no           credit_score        619
                     age                  43
                     country          France
                     num_products          1
                     churn               Yes
no      yes          credit_score        608
                     age                  34
                     country         Germany
                     num_products          0
                     churn                No
churn_stacked.unstack(level=0)
                   member     no    yes
credit_card            
         no credit_score     NaN    619
                     age     NaN     43
                 country     NaN France
            num_products     NaN      1
                   churn     NaN    Yes
        yes credit_score     608    502
                     age      34     23
                 country Germany France
            num_products       0      1
                   churn      No    Yes

Unstack level by name

churn_stacked.head(10)
member  credit_card              
yes     no           credit_score        619
                     age                  43
                     country          France
                     num_products          1
                     churn               Yes
no      yes          credit_score        608
                     age                  34
                     country         Germany
                     num_products          0
                     churn                No
churn_stacked.unstack(level='credit_card')
             credit_card      no     yes
     member            
         no credit_score     NaN     608
                     age     NaN      34
                 country     NaN Germany
            num_products     NaN       0
                   churn     NaN      No
        yes credit_score     619     502
                     age      43      23
                 country  France  France
            num_products       1       1
                   churn     Yes     Yes
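
Selecting a level by position or by name should give the same result when both refer to the same level. The sketch below checks this on a small numeric Series; the data is a simplified, hypothetical subset of the churn example, and the third level name `feature` is an assumption added for illustration.

```python
import pandas as pd

# Hypothetical, simplified subset of the churn data (numeric values only).
index = pd.MultiIndex.from_tuples(
    [
        ("yes", "no", "credit_score"), ("yes", "no", "age"),
        ("no", "yes", "credit_score"), ("no", "yes", "age"),
    ],
    names=["member", "credit_card", "feature"],
)
scores = pd.Series([619, 43, 608, 34], index=index)

# Level 0 is the outermost row level, which is named "member" here.
by_number = scores.unstack(level=0)
by_name = scores.unstack(level="member")

# Both calls move the same level, so the results match.
same_result = by_number.equals(by_name)

# Combinations that do not exist in the data are filled with NaN.
missing = by_number.loc[("no", "credit_score"), "no"]
```

Because missing combinations become NaN, integer values in the unstacked result are converted to floats.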

Sort index

patients_stacked.unstack().sort_index(ascending=False)
           year         2019       2020
        feature   age weight age weight 
   last   first
   Wick  Julien    31     72  32     73 
           John    25     68  26     72
Shelley    Mary    41     68  42     69
          Frank    32     75  33     74
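
The same pattern can be tried on a smaller frame. In the sketch below, the level names `outer`, `middle`, and `inner` and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data with three row levels.
index = pd.MultiIndex.from_tuples(
    [("a", "p", "v"), ("a", "q", "v"), ("b", "p", "v")],
    names=["outer", "middle", "inner"],
)
stacked = pd.Series([1, 2, 3], index=index)

# .unstack() returns rows sorted in ascending order of the remaining levels.
wide = stacked.unstack()

# .sort_index(ascending=False) reverses that order across all row levels.
wide_desc = wide.sort_index(ascending=False)
```

Sorting applies to the remaining row levels together, so both `outer` and `middle` end up in descending order.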

Rearranging levels

patients_stacked
                  year  2019 2020
   last  first feature          
   Wick   John     age    25   26 
                weight    68   72
        Julien     age    31   32
                weight    72   73
Shelley   Mary     age    41   42
                weight    68   69
         Frank     age    32   33
                weight    75   74
patients_stacked.unstack(level=1).stack(level=0)
first                 Frank  John  Julien  Mary
 last   feature year                           
Shelley age     2019   32.0   NaN     NaN  41.0
                2020   33.0   NaN     NaN  42.0
        weight  2019   75.0   NaN     NaN  68.0
                2020   74.0   NaN     NaN  69.0
Wick    age     2019    NaN  25.0    31.0   NaN
                2020    NaN  26.0    32.0   NaN
        weight  2019    NaN  68.0    72.0   NaN
                2020    NaN  72.0    73.0   NaN
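
Chaining the two methods swaps a row level with a column level. The sketch below follows the same steps on a reduced, assumed version of the patients data with only two patients; row order in the result may vary between pandas versions, so values are looked up by label.

```python
import pandas as pd

# Reduced, assumed version of the patients data (two patients only).
index = pd.MultiIndex.from_tuples(
    [
        ("Wick", "John", "age"), ("Wick", "John", "weight"),
        ("Shelley", "Mary", "age"), ("Shelley", "Mary", "weight"),
    ],
    names=["last", "first", "feature"],
)
patients = pd.DataFrame(
    {2019: [25, 68, 41, 68], 2020: [26, 72, 42, 69]},
    index=index,
).rename_axis(columns="year")

# Step 1: move row level 1 ("first") into the columns.
# The columns are now a two-level index: ("year", "first").
step1 = patients.unstack(level=1)

# Step 2: move column level 0 ("year") into the rows.
# Rows are now ("last", "feature", "year"); columns are "first".
reshaped = step1.stack(level=0)

# Look up one existing value and one missing combination by label.
john_age_2019 = reshaped.loc[("Wick", "age", 2019), "John"]
mary_under_wick = reshaped.loc[("Wick", "age", 2019), "Mary"]
```

Since no patient named Mary has the last name Wick, that cell is NaN, which matches the pattern of missing values in the output above.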

Let's practice!
