Importing flat files using NumPy

Introduction to Importing Data in Python

Hugo Bowne-Anderson

Data Scientist at DataCamp


Why NumPy?

  • NumPy arrays: standard for storing numerical data
  • Essential for other packages: e.g. scikit-learn
  • loadtxt()
  • genfromtxt()

Importing flat files using NumPy

import numpy as np
filename = 'MNIST.txt'
data = np.loadtxt(filename, delimiter=',')
data
[[   0.    0.    0.    0.    0.]
 [  86.  250.  254.  254.  254.]
 [   0.    0.    0.    9.  254.]
 ..., 
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]]
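To see what kind of object loadtxt() returns, the sketch below loads a tiny comma-separated sample and inspects the result. The sample text is a hypothetical stand-in for MNIST.txt, and io.StringIO is used only so the example runs without a file on disk.

```python
import io
import numpy as np

# Hypothetical stand-in for a small comma-separated file of pixel values
csv_text = "0,0,0\n86,250,254\n0,9,254\n"

# loadtxt() accepts a filename or a file-like object
data = np.loadtxt(io.StringIO(csv_text), delimiter=',')

print(type(data))   # a NumPy ndarray
print(data.dtype)   # loadtxt() defaults to floating-point values
print(data.shape)   # (number of rows, number of columns)
```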

Customizing your NumPy import

import numpy as np
filename = 'MNIST_header.txt'
data = np.loadtxt(filename, delimiter=',', skiprows=1)
print(data)
[[   0.    0.    0.    0.    0.]
 [  86.  250.  254.  254.  254.]
 [   0.    0.    0.    9.  254.]
 ..., 
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]
 [   0.    0.    0.    0.    0.]]
  • skiprows: how many rows (not indices) you wish to skip
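A minimal sketch of why skiprows matters when the first row is a text header. The column names and values here are made up for illustration, and the data is held in memory rather than read from MNIST_header.txt.

```python
import io
import numpy as np

# Hypothetical file content: one header row followed by numeric rows
csv_text = "pixel0,pixel1,pixel2\n0,0,0\n86,250,254\n"

# Without skiprows, the header text cannot be converted to float
try:
    np.loadtxt(io.StringIO(csv_text), delimiter=',')
    header_failed = False
except ValueError:
    header_failed = True

# skiprows=1 skips one row (a count, not a row index)
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

print(header_failed)
print(data.shape)
```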

Customizing your NumPy import

import numpy as np
filename = 'MNIST_header.txt'
data = np.loadtxt(filename, delimiter=',', skiprows=1, usecols=[0, 2])
print(data)
[[   0.    0.]
 [  86.  254.]
 [   0.    0.]
 ..., 
 [   0.    0.]
 [   0.    0.]
 [   0.    0.]]
  • usecols: list of the indices of the columns you wish to keep
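The same idea can be checked on a small in-memory sample. The header names and values are hypothetical; only the first and third columns (indices 0 and 2) are kept.

```python
import io
import numpy as np

# Hypothetical file content with a header row and three columns
csv_text = "a,b,c\n0,0,0\n86,250,254\n0,0,9\n"

# Keep only the columns at index 0 and index 2
data = np.loadtxt(io.StringIO(csv_text), delimiter=',',
                  skiprows=1, usecols=[0, 2])

print(data.shape)  # three rows, two columns
print(data[1])     # second data row, columns 0 and 2 only
```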

Customizing your NumPy import

data = np.loadtxt(filename, delimiter=',', dtype=str)
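With dtype=str every field is read as a string, so a text header no longer raises an error. A rough sketch, assuming a hypothetical two-column sample; the numeric part can be converted afterwards with astype().

```python
import io
import numpy as np

# Hypothetical file content: a text header plus numeric rows
csv_text = "label,pixel0\n5,0\n0,86\n"

# Every entry, including the header row, is loaded as a string
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', dtype=str)

print(data[0])          # the header row, kept as strings
print(data.dtype.kind)  # 'U' marks a unicode string array

# Convert the rows below the header to floats when numbers are needed
numeric = data[1:].astype(float)
print(numeric)
```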


Mixed datatypes

titanic.csv

                        Name      Sex  Cabin   Fare
     Braund, Mr. Owen Harris     male    NaN    7.3
  Cumings, Mrs. John Bradley   female    C85   71.3
      Heikkinen, Miss. Laina   female    NaN    8.0
Futrelle, Mrs. Jacques Heath   female   C123   53.1
    Allen, Mr. William Henry     male    NaN   8.05
               ^                                 ^
            strings                           floats
1 Source: Kaggle
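One possible way to load columns of different types is genfromtxt(), which was listed earlier alongside loadtxt(). The sketch below is an illustration under assumptions, not the course's own code: the sample rows are a simplified, hypothetical version of the table (names without embedded commas, Cabin column omitted), and dtype=None asks NumPy to infer a type per column.

```python
import io
import numpy as np

# Simplified, hypothetical sample: one string-like and one float column
csv_text = "Name,Sex,Fare\nOwen,male,7.3\nLaina,female,8.0\n"

# names=True takes field names from the header row;
# dtype=None lets genfromtxt() infer each column's type
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                     names=True, dtype=None, encoding='utf-8')

print(data.dtype.names)  # field names taken from the header
print(data['Fare'])      # floats
print(data['Sex'])       # strings
```

The result is a one-dimensional structured array, so columns are accessed by field name rather than by column index.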

Let's practice!
