What is pandas?

Introduction to Data Science in Python

Hillary Green-Lerman

Lead Data Scientist, Looker

What can pandas do?

  • Load tabular data from a variety of sources
  • Search for specific rows or columns
  • Calculate aggregate statistics
  • Combine data from multiple sources
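
The four capabilities listed above can be sketched in a few lines. This is only an illustration: the purchase rows are copied from the table printed later in these slides, while the `categories` table and its values are hypothetical and exist only to demonstrate combining two sources. `io.StringIO` stands in for a CSV file on disk.

```python
import io
import pandas as pd

# A few rows copied from the table printed later in these slides.
csv_text = """suspect,location,item,price
Kirstine Smith,Petroleum Plaza,gas,24.95
Fred Frequentist,Burger Mart,fries,1.95
Gertrude Cox,Burger Mart,fries,1.95
Ronald Aylmer Fisher,Clothing Club,shirt,14.25
"""

# 1. Load tabular data (StringIO stands in for a file on disk).
purchases = pd.read_csv(io.StringIO(csv_text))

# 2. Search for specific rows or columns.
burger_rows = purchases[purchases["location"] == "Burger Mart"]
prices = purchases["price"]

# 3. Calculate an aggregate statistic: total spent per location.
total_by_location = purchases.groupby("location")["price"].sum()

# 4. Combine data from multiple sources.
#    This second table is hypothetical, made up for the example.
categories = pd.DataFrame({
    "location": ["Petroleum Plaza", "Burger Mart", "Clothing Club"],
    "category": ["fuel", "food", "clothing"],
})
combined = purchases.merge(categories, on="location", how="left")

print(burger_rows)
print(total_by_location)
print(combined)
```

A left merge keeps every purchase row and attaches the matching category, which is usually what you want when the second table is a lookup table.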

Tabular data with pandas

Tabular data

+-------------------------------------------------+
|        suspect        |     location    | price |
+-----------------------+-----------------+-------+
| Fred Frequentist      | Petroleum Plaza | 24.95 |
| Ronald Aylmer Fisher  | Clothing Club   | 20.15 |
+-------------------------------------------------+

DataFrame

                suspect         location  price
0      Fred Frequentist  Petroleum Plaza  24.95
1  Ronald Aylmer Fisher    Clothing Club  20.15

CSV files

Loading a CSV

import pandas as pd
df = pd.read_csv('ransom.csv')
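
Since `ransom.csv` is not included here, the following sketch feeds `pd.read_csv` an in-memory string instead. It assumes a file whose first line holds the column names, and reuses the two rows from the tabular data example above. The point is that `read_csv` takes the header from the first line and infers a type for each column.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file; the first line is the header row.
csv_text = (
    "suspect,location,price\n"
    "Fred Frequentist,Petroleum Plaza,24.95\n"
    "Ronald Aylmer Fisher,Clothing Club,20.15\n"
)

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (rows, columns)
print(list(df.columns))   # column names taken from the header line
print(df["price"].dtype)  # numeric columns are parsed as numbers
```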

Displaying a DataFrame

df = pd.read_csv('filename.csv')

print(df)
                 suspect         location          item  price
0         Kirstine Smith  Petroleum Plaza           gas  24.95
1       Fred Frequentist      Burger Mart         fries   1.95
2           Gertrude Cox      Burger Mart         fries   1.95
3   Ronald Aylmer Fisher    Clothing Club         shirt  14.25
4         Kirstine Smith    Clothing Club         dress  20.15
5       Fred Frequentist   Groceries R Us     cucumbers   2.05
6         Kirstine Smith    Clothing Club         dress  20.15
7           Gertrude Cox  Petroleum Plaza   fizzy drink   1.90
8           Gertrude Cox      Burger Mart         fries   1.95
9   Ronald Aylmer Fisher    Clothing Club         shirt  14.25
10  Ronald Aylmer Fisher  Petroleum Plaza       carwash  13.25
11  Ronald Aylmer Fisher    Clothing Club         shirt  14.25
12        Kirstine Smith  Petroleum Plaza           gas  24.95
13      Fred Frequentist   Groceries R Us          eggs   6.50
14          Gertrude Cox  Petroleum Plaza           gas  24.95
15      Fred Frequentist   Groceries R Us          eggs   6.50
16  Ronald Aylmer Fisher   Groceries R Us          eggs   6.50
17      Fred Frequentist   Groceries R Us        cheese   5.00

Inspecting a DataFrame

df.head()
print(df.head())
                 suspect         location          item  price
0         Kirstine Smith  Petroleum Plaza           gas  24.95
1       Fred Frequentist      Burger Mart         fries   1.95
2           Gertrude Cox      Burger Mart         fries   1.95
3   Ronald Aylmer Fisher    Clothing Club         shirt  14.25
4         Kirstine Smith    Clothing Club         dress  20.15
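
`head()` returns the first five rows by default. As a small sketch, it also accepts a row count, and `tail()` does the same from the end of the table. The DataFrame below is hypothetical and built in memory only to show the row counts.

```python
import pandas as pd

# Hypothetical ten-row DataFrame, used only to demonstrate head() and tail().
df = pd.DataFrame({
    "letter_index": range(1, 11),
    "letter": list("abcdefghij"),
})

first_five = df.head()    # default: first 5 rows
first_three = df.head(3)  # first 3 rows
last_two = df.tail(2)     # last 2 rows

print(first_three)
print(last_two)
```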

Inspecting a DataFrame

df.info()
df.info()  # info() prints the summary itself and returns None, so no print() is needed
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26 entries, 0 to 25
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype  
 --  ------        --------------  -----  
 0   letter_index  26 non-null     int64 
 1   letter        26 non-null     object 
 2   frequency     26 non-null     float64 
dtypes: float64(1), int64(1), object(1)
memory usage: 704.0+ bytes
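
One detail worth knowing: `info()` writes its summary directly to standard output and returns `None`. The sketch below shows that, plus two alternatives: capturing the summary as a string through the `buf` parameter, and reading the column types from `df.dtypes`. The column names follow the output above; the three rows of values are hypothetical.

```python
import io
import pandas as pd

# Column names match the info() output above; the values are made up.
df = pd.DataFrame({
    "letter_index": [1, 2, 3],
    "letter": ["a", "b", "c"],
    "frequency": [0.1, 0.2, 0.3],
})

# info() prints the summary and returns None.
result = df.info()
print(result)

# To keep the summary as text, pass a buffer instead of printing.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

# dtypes gives only the per-column types, as a Series.
# The exact dtype shown for text columns depends on the pandas version.
print(df.dtypes)
```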

Let's practice!
