pandas-DataFrames filtern

Python für Fortgeschrittene

Hugo Bowne-Anderson

Data Scientist at DataCamp

BRICS

import pandas as pd
brics = pd.read_csv("path/to/brics.csv", index_col = 0)
brics
         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.98
Python für Fortgeschrittene

Ziel

         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.98
  • Länder mit einer Fläche von über 8 Millionen Quadratkilometer
  • Drei Schritte
    • Auswahl der Spalte „area“
    • Vergleiche auf der Spalte „area“ anstellen
    • Ergebnis zur Auswahl der Länder nutzen
Python für Fortgeschrittene

Schritt 1: Auswahl der Spalte

         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.98
brics["area"]
BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64    # - Need Pandas Series
  • Alternativen:
brics.loc[:,"area"]
brics.iloc[:,2]
Python für Fortgeschrittene

Schritt 2: Vergleichen

brics["area"]
BR     8.516
RU    17.100
IN     3.286
CH     9.597
SA     1.221
Name: area, dtype: float64
brics["area"] > 8
BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool
is_huge = brics["area"] > 8
Python für Fortgeschrittene

Schritt 3: Teilmenge des Dataframe bilden

is_huge
BR     True
RU     True
IN    False
CH     True
SA    False
Name: area, dtype: bool
brics[is_huge]
   country   capital    area  population
BR  Brazil  Brasilia   8.516       200.4
RU  Russia    Moscow  17.100       143.5
CH   China   Beijing   9.597      1357.0
Python für Fortgeschrittene

Zusammenfassung

         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.988
is_huge = brics["area"] > 8
brics[is_huge]
   country   capital    area  population
BR  Brazil  Brasilia   8.516       200.4
RU  Russia    Moscow  17.100       143.5
CH   China   Beijing   9.597      1357.0
brics[brics["area"] > 8]
   country   capital    area  population
BR  Brazil  Brasilia   8.516       200.4
RU  Russia    Moscow  17.100       143.5
CH   China   Beijing   9.597      1357.0
Python für Fortgeschrittene

Boolesche Operatoren

         country    capital    area  population
BR        Brazil   Brasilia   8.516      200.40
RU        Russia     Moscow  17.100      143.50
IN         India  New Delhi   3.286     1252.00
CH         China    Beijing   9.597     1357.00
SA  South Africa   Pretoria   1.221       52.98
import numpy as np
np.logical_and(brics["area"] > 8, brics["area"] < 10)
BR     True
RU    False
IN    False
CH     True
SA    False
Name: area, dtype: bool
brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]
   country   capital   area  population
BR  Brazil  Brasilia  8.516       200.4
CH   China   Beijing  9.597      1357.0
Python für Fortgeschrittene

Lass uns üben!

Python für Fortgeschrittene

Preparing Video For Download...