Reshaping Data with pandas
Maria Eugenia Inzaugarat
Data Scientist
another_fifa.head()
name variable metric_system imperial_system
0 Cristiano Ronaldo weight 83 183.00
1 J. Oblak weight 87 191.00
2 Cristiano Ronaldo height 187 6.13
3 J. Oblak height 188 6.16
4 Cristiano Ronaldo height 187 6.14
another_fifa.pivot(index="name", columns="variable")
Traceback (most recent call last):
ValueError: Index contains duplicate entries, cannot reshape
another_fifa.pivot_table(index="name", columns="variable", aggfunc="mean")
metric_system imperial_system
variable height weight height weight
name
Cristiano Ronaldo 187 83 6.135 183.0
J. Oblak 188 87 6.160 191.0
fifa_players.head(6)
first last movement overall attacking
0 Lionel Messi shooting 92 70
1 Cristiano Ronaldo shooting 93 89
2 Lionel Messi passing 92 92
3 Cristiano Ronaldo passing 82 83
4 Lionel Messi passing 96 88
5 Cristiano Ronaldo passing 89 84
fifa_players.head(6)
first last movement overall attacking
0 Lionel Messi shooting 92 70
1 Cristiano Ronaldo shooting 93 89
2 Lionel Messi passing 92 92
3 Cristiano Ronaldo passing 82 83
4 Lionel Messi passing 96 88
5 Cristiano Ronaldo passing 89 84
fifa_players.pivot_table(index= , columns="movement", values= , aggfunc= )
fifa_players.head(6)
first last movement overall attacking
0 Lionel Messi shooting 92 70
1 Cristiano Ronaldo shooting 93 89
2 Lionel Messi passing 92 92
3 Cristiano Ronaldo passing 82 83
4 Lionel Messi passing 96 88
5 Cristiano Ronaldo passing 89 84
fifa_players.pivot_table(index=["first", "last"], columns="movement", values= , aggfunc= )
fifa_players.head(6)
first last movement overall attacking
0 Lionel Messi shooting 92 70
1 Cristiano Ronaldo shooting 93 89
2 Lionel Messi passing 92 92
3 Cristiano Ronaldo passing 82 83
4 Lionel Messi passing 96 88
5 Cristiano Ronaldo passing 89 84
fifa_players.pivot_table(index=["first", "last"], columns="movement", values=["overall", "attacking"], aggfunc="max")
attacking overall
movement passing shooting passing shooting
first last
Cristiano Ronaldo 84 89 89 93
Lionel Messi 92 70 96 92
fifa_players.pivot_table(index=["first", "last"], columns="movement", aggfunc="count", )
fifa_players.pivot_table(index=["first", "last"], columns="movement", aggfunc="count", margins=True)
attacking overall
movement passing shooting All passing shooting All
First Last
Cristiano Ronaldo 2 1 3 2 1 3
Lionel Messi 2 1 3 2 1 3
All 4 2 6 4 2 6
Does the DataFrame have more than one value for each index/column pair?
Do you need to have a multi-index in your resulting pivoted DataFrame?
Do you need summary statistics of your large DataFrame?
Yes! Use .pivot_table()
Reshaping Data with pandas