Streamlined Data Ingestion with pandas
Amany Mahfouz
Instructor
concat()
A pandas function, called as pd.concat([df1, df2])
Set ignore_index to True to renumber rows

# Get the first 20 bookstore results
params = {"term": "bookstore",
          "location": "San Francisco"}
first_results = requests.get(api_url,
                             headers=headers,
                             params=params).json()
first_20_bookstores = json_normalize(first_results["businesses"],
                                     sep="_")
print(first_20_bookstores.shape)
(20, 24)
# Get the next 20 bookstores
params["offset"] = 20
next_results = requests.get(api_url,
                            headers=headers,
                            params=params).json()
next_20_bookstores = json_normalize(next_results["businesses"],
                                    sep="_")
print(next_20_bookstores.shape)
(20, 24)
# Combine the bookstore datasets, renumbering rows
bookstores = pd.concat([first_20_bookstores, next_20_bookstores],
                       ignore_index=True)
print(bookstores.name)
0 City Lights Bookstore
1 Alexander Book Company
2 Borderlands Books
3 Alley Cat Books
4 Dog Eared Books
... ...
35 Forest Books
36 San Francisco Center For The Book
37 KingSpoke - Book Store
38 Eastwind Books & Arts
39 My Favorite
Name: name, dtype: object
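
To make the effect of ignore_index concrete, here is a minimal sketch using two tiny made-up dataframes in place of the API results. The names page_1, page_2, kept, and renumbered are illustrative and do not come from the slides.

```python
import pandas as pd

# Two toy dataframes standing in for two pages of results (made-up data)
page_1 = pd.DataFrame({"name": ["Shop A", "Shop B"], "rating": [4.5, 4.0]})
page_2 = pd.DataFrame({"name": ["Shop C", "Shop D"], "rating": [3.5, 5.0]})

# Default behavior: each dataframe keeps its own row labels, so 0 and 1 repeat
kept = pd.concat([page_1, page_2])

# ignore_index=True discards the old labels and numbers rows from 0 upward
renumbered = pd.concat([page_1, page_2], ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```

Without renumbering, a label-based lookup such as kept.loc[0] would return two rows, which is usually not what you want after appending pages.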
merge(): the pandas version of a SQL join
merge() is both a pandas function and a dataframe method
df.merge()
Use on if the key column name is the same in both dataframes
Use left_on and right_on if the key names differ

call_counts.head()
created_date call_counts
0 01/01/2018 4597
1 01/02/2018 4362
2 01/03/2018 3045
3 01/04/2018 3374
4 01/05/2018 4333
weather.head()
date tmax tmin
0 12/01/2017 52 42
1 12/02/2017 48 39
2 12/03/2017 48 42
3 12/04/2017 51 40
4 12/05/2017 61 50
# Merge weather into call_counts on the date columns
merged = call_counts.merge(weather, left_on="created_date", right_on="date")
print(merged.head())
created_date call_counts date tmax tmin
0 01/01/2018 4597 01/01/2018 19 7
1 01/02/2018 4362 01/02/2018 26 13
2 01/03/2018 3045 01/03/2018 30 16
3 01/04/2018 3374 01/04/2018 29 19
4 01/05/2018 4333 01/05/2018 19 9
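
The output above carries both key columns, created_date and date. The sketch below, built from the first two rows shown in the slides, illustrates why: left_on/right_on keeps both keys, while matching names with on leaves a single key column. Renaming the weather column is one possible approach chosen for illustration, not something the slides prescribe.

```python
import pandas as pd

# First two rows of each dataset, copied from the output shown above
calls = pd.DataFrame({"created_date": ["01/01/2018", "01/02/2018"],
                      "call_counts": [4597, 4362]})
weather = pd.DataFrame({"date": ["01/01/2018", "01/02/2018"],
                        "tmax": [19, 26],
                        "tmin": [7, 13]})

# Different key names: both key columns appear in the result
both_keys = calls.merge(weather, left_on="created_date", right_on="date")

# Same key name in both dataframes: `on` yields a single key column
single_key = calls.merge(weather.rename(columns={"date": "created_date"}),
                         on="created_date")

print(list(both_keys.columns))
# ['created_date', 'call_counts', 'date', 'tmax', 'tmin']
print(list(single_key.columns))
# ['created_date', 'call_counts', 'tmax', 'tmin']
```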
merge(): by default, keeps only values that are in both datasets
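
A small sketch of what "only values in both datasets" means in practice. The 12/31/2017 weather row and its tmax value are made up for illustration, and how="left" is shown only as one alternative when every call_counts row should be kept.

```python
import pandas as pd

calls = pd.DataFrame({"created_date": ["01/01/2018", "01/02/2018", "01/03/2018"],
                      "call_counts": [4597, 4362, 3045]})

# Hypothetical weather subset: has an extra date and lacks 01/03/2018
weather = pd.DataFrame({"date": ["12/31/2017", "01/01/2018", "01/02/2018"],
                        "tmax": [40, 19, 26]})

# Default (inner) merge: only dates present in both dataframes survive
inner = calls.merge(weather, left_on="created_date", right_on="date")

# how="left" keeps every row of calls and fills missing weather with NaN
left = calls.merge(weather, how="left",
                   left_on="created_date", right_on="date")

print(len(inner))                 # 2
print(len(left))                  # 3
print(left["tmax"].isna().sum())  # 1
```

This is why the merged output starts at 01/01/2018 even though weather begins on 12/01/2017: the December dates have no match in call_counts.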