Reshaping Data with pandas
Maria Eugenia Inzaugarat
Data Scientist
import pandas as pd
fifa_players = pd.read_csv("fifa_players.csv")
fifa_players
name age nationality club
0 Lionel Messi 32 Argentina Barcelona
1 Cristiano Ronaldo 34 Portugal Juventus
2 Neymar da Silva 27 Brazil Saint-Germain
fifa_players.shape
(3, 4)
fifa_players
vv
name | age | nationality club
0 Lionel Messi | 32 | Argentina Barcelona
1 Cristiano Ronaldo | 34 | Portugal Juventus
2 Neymar da Silva | 27 | Brazil Saint-Germain
^^
fifa_players
name age nationality club
0 Lionel Messi 32 Argentina Barcelona <--
1 Cristiano Ronaldo 34 Portugal Juventus <--
2 Neymar da Silva 27 Brazil Saint-Germain <--
fifa_players
name age nationality club
0 Lionel Messi 32 Argentina Barcelona
--------------------------------------------------------
1 Cristiano Ronaldo NaN <- Portugal Juventus
--------------------------------------------------------
2 Neymar da Silva 27 Brazil Saint-Germain
fifa_players_long.head()
name variable value
0 Cristiano Ronaldo nationality Portugal
1 Cristiano Ronaldo club Juventus
2 Lionel Messi age 32
3 Lionel Messi nationality Argentina
4 Lionel Messi club Barcelona
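The slides do not show how fifa_players_long was built. A minimal sketch of one way to get a table with the same three columns, assuming DataFrame.melt() with name as the identifier column; the inline frame copies the values from the first table because fifa_players.csv is not available here, and the row order and missing-value handling of the slide's output are not reproduced:

```python
import pandas as pd

# Inline copy of the wide table shown earlier (stand-in for fifa_players.csv)
fifa_players = pd.DataFrame({
    "name": ["Lionel Messi", "Cristiano Ronaldo", "Neymar da Silva"],
    "age": [32, 34, 27],
    "nationality": ["Argentina", "Portugal", "Brazil"],
    "club": ["Barcelona", "Juventus", "Saint-Germain"],
})

# Keep "name" as the identifier; every other column becomes
# one (variable, value) pair per player
fifa_players_long = fifa_players.melt(id_vars="name")

print(fifa_players_long.head())
```

Each player now spans several rows, one per feature, and the name column ties those rows back to the same player.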
fifa_players_long.head()
name variable value
0 Cristiano Ronaldo nationality Portugal <--
1 Cristiano Ronaldo club Juventus
2 Lionel Messi age 32
3 Lionel Messi nationality Argentina <--
4 Lionel Messi club Barcelona
fifa_players_long.head()
name variable value
0 Cristiano Ronaldo nationality Portugal <--
1 Cristiano Ronaldo club Juventus <--
2 Lionel Messi age 32
3 Lionel Messi nationality Argentina
4 Lionel Messi club Barcelona
fifa_players_long.head()
| name | variable value
0 | Cristiano Ronaldo | nationality Portugal
1 | Cristiano Ronaldo | club Juventus
2 | Lionel Messi | age 32
3 | Lionel Messi | nationality Argentina
4 | Lionel Messi | club Barcelona
^^^^^^^^^^^
A column (name) to identify the same player
fifa_players.set_index('club')
name age nationality
club
Barcelona Lionel Messi 32 Argentina
Juventus Cristiano Ronaldo NaN Portugal
Saint-Germain Neymar da Silva 27 Brazil
fifa_players.set_index('club')[['name', 'nationality']]
name nationality
club
Barcelona Lionel Messi Argentina
Juventus Cristiano Ronaldo Portugal
Saint-Germain Neymar da Silva Brazil
fifa_players.set_index('club')[['name', 'nationality']].transpose()
club Barcelona Juventus Saint-Germain
name Lionel Messi Cristiano Ronaldo Neymar da Silva
nationality Argentina Portugal Brazil
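The three steps above (set the index, select columns, transpose) can be tried end to end with a small sketch. The inline frame is an assumption standing in for fifa_players.csv and copies the values from the first table:

```python
import pandas as pd

# Inline copy of the wide table (stand-in for fifa_players.csv)
fifa_players = pd.DataFrame({
    "name": ["Lionel Messi", "Cristiano Ronaldo", "Neymar da Silva"],
    "age": [32, 34, 27],
    "nationality": ["Argentina", "Portugal", "Brazil"],
    "club": ["Barcelona", "Juventus", "Saint-Germain"],
})

# Use club as the row index and keep two of the remaining columns
by_club = fifa_players.set_index("club")[["name", "nationality"]]

# Transposing swaps rows and columns: clubs become column labels
transposed = by_club.transpose()

print(transposed)
```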
Wide to long: performed using pandas functions, such as:
.melt()
.wide_to_long()
Long to wide: performed using pandas methods, for example:
.pivot()
.pivot_table()
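As a rough illustration of the long-to-wide direction, the sketch below applies .pivot() to the five rows shown in the fifa_players_long.head() output. The inline frame is an assumption built from that printed output, not the full dataset:

```python
import pandas as pd

# The five rows printed by fifa_players_long.head() earlier
fifa_players_long = pd.DataFrame({
    "name": ["Cristiano Ronaldo", "Cristiano Ronaldo",
             "Lionel Messi", "Lionel Messi", "Lionel Messi"],
    "variable": ["nationality", "club", "age", "nationality", "club"],
    "value": ["Portugal", "Juventus", 32, "Argentina", "Barcelona"],
})

# One row per player, one column per distinct entry in "variable"
fifa_players_wide = fifa_players_long.pivot(
    index="name", columns="variable", values="value"
)

print(fifa_players_wide)
```

Because these rows contain no age entry for Cristiano Ronaldo, that cell comes back as a missing value. .pivot() also requires each (name, variable) pair to appear only once; .pivot_table() is the usual choice when duplicates need to be aggregated.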