Practical examples

Introduction to Testing in Python

Alexander Levin

Data Scientist

Data and pipeline

Data: salaries in data science.

Each row describes one data science worker: salary, job title, and other attributes.

[Figure: data science salaries table]

Pipeline: to compute the mean salary:

  1. Read the data
  2. Filter by employment type
  3. Compute the mean salary
  4. Save the result
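
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the course's implementation: `run_pipeline`, `SAMPLE_CSV`, and the output file name `mean_salary.txt` are hypothetical, and the slides do not show how step 4 saves its result, so writing the number to a text file is an assumption.

```python
import io
import os
import tempfile

import pandas as pd

# Hypothetical sample rows; the real ds_salaries.csv has more columns.
SAMPLE_CSV = """employment_type,salary_in_usd
FT,100000
PT,20000
FT,140000
"""


def run_pipeline(source, out_path):
    # 1. Read the data (a path or any file-like object)
    df = pd.read_csv(source)
    # 2. Filter by employment type
    filtered = df[df['employment_type'] == 'FT']
    # 3. Compute the mean salary
    mean_salary = filtered['salary_in_usd'].mean()
    # 4. Save the result (assumed format: plain text)
    with open(out_path, 'w') as out_file:
        out_file.write(str(mean_salary))
    return mean_salary


# Run the sketch against the in-memory sample and a temporary output file
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'mean_salary.txt')
    result = run_pipeline(io.StringIO(SAMPLE_CSV), out_path)
    with open(out_path) as in_file:
        saved = in_file.read()
```

Keeping each step small makes it easy to test the steps separately, which is what the following slides do.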

Pipeline code

import pandas as pd
import pytest

# Fixture that reads the data
@pytest.fixture
def read_df():
    return pd.read_csv('ds_salaries.csv')

# Function that filters the data by employment type
def filter_df(df):
    return df[df['employment_type'] == 'FT']

# Function that computes the mean salary
def get_mean(df):
    return df['salary_in_usd'].mean()

Integration tests

Test cases:

  • Reading the data
  • Writing to a file

Code:

def test_read_df(read_df):
    # Check the type of the dataframe
    assert isinstance(read_df, pd.DataFrame)
    # Check that df contains rows
    assert read_df.shape[0] > 0

Integration tests

An example that checks that Python can create a file.

import os

def test_write():
    # Open the file in write mode
    with open('temp.txt', 'w') as wfile:
        # Write text to the file
        wfile.write('Testing stuff is awesome')
    # Check that the file exists
    assert os.path.exists('temp.txt')
    # Don't forget to clean up afterwards
    os.remove('temp.txt')
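
One possible variation on the file-writing test: pytest's built-in `tmp_path` fixture supplies a fresh temporary directory as a `pathlib.Path`, so the test does not need to remove the file itself, and nothing is left behind if an assertion fails. This is a sketch of an alternative, not the version used on the slide.

```python
import pathlib
import tempfile


def test_write(tmp_path):
    # Under pytest, tmp_path is a pathlib.Path to a per-test temporary directory
    target = tmp_path / 'temp.txt'
    # Write text to the file
    target.write_text('Testing stuff is awesome')
    # Check that the file exists and holds what was written
    assert target.exists()
    assert target.read_text() == 'Testing stuff is awesome'


# Outside pytest, the same function can be exercised with any directory path
with tempfile.TemporaryDirectory() as tmp_dir:
    test_write(pathlib.Path(tmp_dir))
```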

Unit tests

Test cases:

  • The filtered dataset contains only the 'FT' employment type
  • The get_mean() function returns a number

Code:

def test_units(read_df):
    filtered = filter_df(read_df)
    # Every remaining row must have employment type 'FT'
    assert (filtered['employment_type'] == 'FT').all()
    # The mean must be a float
    assert isinstance(get_mean(filtered), float)
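
Unit tests like these can also run against a small hand-built DataFrame, so they do not depend on the CSV file being present. The sketch below assumes hypothetical sample rows (`make_sample_df` is not part of the course code) and repeats `filter_df` and `get_mean` so that it is self-contained. It checks the column with `.all()`, because comparing the result of `.unique()` to a list yields an array, and asserting on an array raises an error as soon as it holds more than one element.

```python
import pandas as pd


def filter_df(df):
    return df[df['employment_type'] == 'FT']


def get_mean(df):
    return df['salary_in_usd'].mean()


def make_sample_df():
    # Hypothetical rows with a known answer: the 'FT' mean is 120000
    return pd.DataFrame({
        'employment_type': ['FT', 'PT', 'FT', 'CT'],
        'salary_in_usd': [100000, 20000, 140000, 50000],
    })


def test_filter_df_keeps_only_ft():
    filtered = filter_df(make_sample_df())
    # Two of the four sample rows are 'FT'
    assert len(filtered) == 2
    assert (filtered['employment_type'] == 'FT').all()


def test_get_mean_returns_float():
    filtered = filter_df(make_sample_df())
    result = get_mean(filtered)
    assert isinstance(result, float)
    assert result == 120000.0
```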

Feature tests

Test cases:

  • The mean is greater than zero
  • The mean is not greater than the maximum salary in the dataset

Code:

def test_feature(read_df):
    # Filter the data
    filtered = filter_df(read_df)
    # Test case: mean > 0
    assert get_mean(filtered) > 0
    # Test case: mean <= maximum
    assert get_mean(filtered) <= read_df['salary_in_usd'].max()

Performance tests

Test cases:

  • Execution time of the pipeline from start to finish

Code:

def test_performance(benchmark, read_df):
    # Benchmark decorator (the benchmark fixture comes from the pytest-benchmark plugin)
    @benchmark
    # Function to measure
    def get_result():
        filtered = filter_df(read_df)
        return get_mean(filtered)
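
For a rough timing when the benchmark plugin is not installed, the standard library's `timeit` module can measure the same filter-and-mean step. This is a sketch under stated assumptions: the DataFrame is synthetic, the repetition counts are arbitrary, and `timeit` reports raw durations only, without the statistics a dedicated benchmark tool provides.

```python
import timeit

import pandas as pd


def filter_df(df):
    return df[df['employment_type'] == 'FT']


def get_mean(df):
    return df['salary_in_usd'].mean()


# Synthetic data (hypothetical): 10,000 rows alternating 'FT' and 'PT'
df = pd.DataFrame({
    'employment_type': ['FT', 'PT'] * 5000,
    'salary_in_usd': list(range(10000)),
})


def pipeline():
    # The step being measured: filter, then compute the mean
    return get_mean(filter_df(df))


# Take 3 measurements of 5 calls each and keep the fastest one
CALLS_PER_MEASUREMENT = 5
timings = timeit.repeat(pipeline, number=CALLS_PER_MEASUREMENT, repeat=3)
best_per_call = min(timings) / CALLS_PER_MEASUREMENT
print(f'Best time per call: {best_per_call:.6f} s')
```

Taking the minimum of several measurements reduces the influence of other processes running on the machine.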

Final test suite

import os

import pandas as pd
import pytest

## Integration Tests
def test_read_df(read_df):
    # Check the type of the dataframe
    assert isinstance(read_df, pd.DataFrame)
    # Check that df contains rows
    assert read_df.shape[0] > 0
def test_write():
    with open('temp.txt', 'w') as wfile:
        wfile.write('12345')
    assert os.path.exists('temp.txt')
    os.remove('temp.txt')

## Unit Tests
def test_units(read_df):
    filtered = filter_df(read_df)
    assert (filtered['employment_type'] == 'FT').all()
    assert isinstance(get_mean(filtered), float)
## Feature Tests
def test_feature(read_df):
    # Filtering the data
    filtered = filter_df(read_df)
    # Test case: mean is greater than zero
    assert get_mean(filtered) > 0
    # Test case: mean is not bigger than the maximum
    assert get_mean(filtered) <= read_df['salary_in_usd'].max()

## Performance Tests
def test_performance(benchmark, read_df):
    # Benchmark decorator
    @benchmark
    # Function to measure
    def pipeline():
        filtered = filter_df(read_df)
        return get_mean(filtered)

Let's practice!
