Mastering test execution

Unit Testing for Data Science in Python

Dibya Chakravorty

Test Automation Engineer

Test organization

Running all tests

cd tests

pytest
  • Recurses into directory subtree of tests/.
    • Filenames starting with test_ → test module.
      • Classnames starting with Test → test class.
        • Function names starting with test_ → unit test.
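The naming rules above can be sketched as three small predicates. This is only an illustration of the conventions listed on the slide, not pytest's actual collection code; the function names are hypothetical, and real collection is configurable through the python_files, python_classes, and python_functions ini options.

```python
from fnmatch import fnmatchcase


def is_test_module(filename):
    """Slide rule: a filename starting with test_ is a test module."""
    # Hypothetical helper; assumes a bare filename such as "test_train.py".
    return fnmatchcase(filename, "test_*.py")


def is_test_class(class_name):
    """Slide rule: a class name starting with Test is a test class."""
    return class_name.startswith("Test")


def is_unit_test(function_name):
    """Slide rule: a function name starting with test_ is a unit test."""
    return function_name.startswith("test_")
```

For instance, is_test_module("test_train.py") holds, while a helper file named train.py would be skipped under these rules.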

================================================================== test session starts =========================================
data/test_preprocessing_helpers.py ........F....                                                                          [ 81%]
features/test_as_numpy.py .                                                                                               [ 87%]
models/test_train.py ..                                                                                                   [100%]

======================================================================== FAILURES ==============================================
____________________________________________________ TestRowToList.test_on_one_tab_with_missing_value __________________________

self = <tests.data.test_preprocessing_helpers.TestRowToList object at 0x7f6205475240>

    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
>       assert actual is None, "Expected: None, Actual: {0}".format(actual)
E       AssertionError: Expected: None, Actual: ['', '4,567']
E       assert ['', '4,567'] is None

data/test_preprocessing_helpers.py:55: AssertionError
========================================================== 1 failed, 15 passed in 0.46 seconds ================================
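To make the failure report above easier to read, here is a minimal sketch of how such a result could arise and be fixed. Both implementations are assumptions: the real row_to_list() is not shown in these slides, naive_row_to_list() is a hypothetical buggy variant that reproduces the reported value, and the rule "return None unless there are exactly two non-empty fields" is inferred only from the test's expectation.

```python
def naive_row_to_list(row):
    """Hypothetical buggy version: splits on tabs without validating fields."""
    return row.rstrip("\n").split("\t")


def row_to_list(row):
    """Hypothetical fixed version (assumed behavior, inferred from the test).

    Returns None unless the row has exactly two non-empty tab-separated fields.
    """
    fields = row.rstrip("\n").split("\t")
    if len(fields) != 2 or "" in fields:
        return None
    return fields


class TestRowToList:
    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
        assert actual is None, "Expected: None, Actual: {0}".format(actual)
```

With the naive version, the row "\t4,567\n" yields ['', '4,567'], which is the value the report shows next to "Actual:". With the validating version the same test passes.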

Typical scenario: CI server

Binary question: do all unit tests pass?

The -x flag: stop after first failure

pytest -x
========================================= test session starts =========================================
data/test_preprocessing_helpers.py ........F

============================================== FAILURES ===============================================
__________________________ TestRowToList.test_on_one_tab_with_missing_value ___________________________

self = <tests.data.test_preprocessing_helpers.TestRowToList object at 0x7f6309f17198>

    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
>       assert actual is None, "Expected: None, Actual: {0}".format(actual)
E       AssertionError: Expected: None, Actual: ['', '4,567']
E       assert ['', '4,567'] is None

data/test_preprocessing_helpers.py:55: AssertionError
================================= 1 failed, 8 passed in 0.45 seconds ==================================
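A CI job that only needs the binary answer can rely on pytest's exit code, which is 0 only when every collected test passed. The sketch below is one possible way to script this from Python; the helper names and the "tests" directory default are assumptions for illustration, and it presumes pytest is installed for the running interpreter.

```python
import subprocess
import sys


def build_pytest_command(stop_on_first_failure=False):
    """Build the command line; -x makes pytest stop after the first failure."""
    command = [sys.executable, "-m", "pytest"]
    if stop_on_first_failure:
        command.append("-x")
    return command


def all_tests_pass(returncode):
    """pytest exits with 0 only if all collected tests passed."""
    return returncode == 0


def run_test_suite(tests_dir="tests", stop_on_first_failure=True):
    """Run pytest inside tests_dir (hypothetical default) and answer yes or no."""
    completed = subprocess.run(
        build_pytest_command(stop_on_first_failure), cwd=tests_dir
    )
    return all_tests_pass(completed.returncode)
```

Calling run_test_suite() from the project root would then return False as soon as one test fails, without waiting for the rest of the suite.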

Running tests in a test module

pytest data/test_preprocessing_helpers.py
data/test_preprocessing_helpers.py ........F....                                                [100%]

============================================== FAILURES ===============================================
__________________________ TestRowToList.test_on_one_tab_with_missing_value ___________________________

self = <tests.data.test_preprocessing_helpers.TestRowToList object at 0x7f435947f198>

    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
>       assert actual is None, "Expected: None, Actual: {0}".format(actual)
E       AssertionError: Expected: None, Actual: ['', '4,567']
E       assert ['', '4,567'] is None

data/test_preprocessing_helpers.py:55: AssertionError
================================= 1 failed, 12 passed in 0.07 seconds =================================

Running only a particular test class

Node ID

  • Node ID of a test class: <path to test module>::<test class name>
  • Node ID of a unit test: <path to test module>::<test class name>::<unit test name>
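The two node ID formats above can be sketched as a small helper that joins the parts with a double colon. The function name node_id() is hypothetical and exists only for this illustration; pytest itself just takes the resulting string on the command line.

```python
def node_id(module_path, class_name=None, test_name=None):
    """Join the module path, test class, and unit test name with '::'."""
    parts = [module_path]
    if class_name is not None:
        parts.append(class_name)
    if test_name is not None:
        parts.append(test_name)
    return "::".join(parts)


# Node ID of a test class
class_id = node_id("data/test_preprocessing_helpers.py", "TestRowToList")

# Node ID of a unit test inside that class
test_id = node_id(
    "data/test_preprocessing_helpers.py",
    "TestRowToList",
    "test_on_one_tab_with_missing_value",
)
```

Either string can then be passed to pytest as its only argument.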

Running tests using node ID

  • Run the test class TestRowToList.
pytest data/test_preprocessing_helpers.py::TestRowToList
data/test_preprocessing_helpers.py ..F....                                                      [100%]

============================================== FAILURES ===============================================
__________________________ TestRowToList.test_on_one_tab_with_missing_value ___________________________

self = <tests.data.test_preprocessing_helpers.TestRowToList object at 0x7ffb3bac4da0>

    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
>       assert actual is None, "Expected: None, Actual: {0}".format(actual)
E       AssertionError: Expected: None, Actual: ['', '4,567']
E       assert ['', '4,567'] is None

data/test_preprocessing_helpers.py:55: AssertionError
================================= 1 failed, 6 passed in 0.06 seconds ==================================

  • Run the unit test test_on_one_tab_with_missing_value().
pytest data/test_preprocessing_helpers.py::TestRowToList::test_on_one_tab_with_missing_value
data/test_preprocessing_helpers.py F                                                            [100%]

============================================== FAILURES ===============================================
__________________________ TestRowToList.test_on_one_tab_with_missing_value ___________________________

self = <tests.data.test_preprocessing_helpers.TestRowToList object at 0x7f4eece33b00>

    def test_on_one_tab_with_missing_value(self):    # (1, 1) boundary value
        actual = row_to_list("\t4,567\n")
>       assert actual is None, "Expected: None, Actual: {0}".format(actual)
E       AssertionError: Expected: None, Actual: ['', '4,567']
E       assert ['', '4,567'] is None

data/test_preprocessing_helpers.py:55: AssertionError
====================================== 1 failed in 0.06 seconds =======================================

Running tests using keyword expressions

The -k option

pytest -k "pattern"
  • Runs all tests whose node ID matches the pattern; a plain word matches as a substring of the module, class, or test name.

  • Run the test class TestSplitIntoTrainingAndTestingSets.
pytest -k "TestSplitIntoTrainingAndTestingSets"
models/test_train.py ..                                                                         [100%]

=============================== 2 passed, 14 deselected in 0.36 seconds ===============================
  • A unique part of the class name is enough.
pytest -k "TestSplit"
models/test_train.py ..                                                                         [100%]

=============================== 2 passed, 14 deselected in 0.36 seconds ===============================

Supports Python logical operators (and, or, not)

pytest -k "TestSplit and not test_on_one_row"
models/test_train.py .                                                      [100%]

==================== 1 passed, 15 deselected in 0.36 seconds ====================
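To see why the expression above selects one test and deselects the rest, the matching can be approximated in a few lines. This is a simplified stand-in, not pytest's real expression parser: each word is treated as a case-insensitive substring test against the node ID, and the operators and, or, and not are left for Python to evaluate. The name matches_keyword() and the test name test_on_six_rows are hypothetical.

```python
import re

OPERATORS = {"and", "or", "not"}


def matches_keyword(expression, node_id):
    """Approximate 'pytest -k expression' for a single node ID (sketch only)."""

    def substitute(match):
        word = match.group(0)
        if word in OPERATORS:
            return word
        # Every other word becomes True or False: is it a substring of the ID?
        return str(word.lower() in node_id.lower())

    boolean_expression = re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, expression)
    # After substitution only True, False, and, or, not remain as names,
    # so eval is used here purely to combine them; builtins are disabled.
    return eval(boolean_expression, {"__builtins__": {}}, {})


expression = "TestSplit and not test_on_one_row"
prefix = "models/test_train.py::TestSplitIntoTrainingAndTestingSets::"

deselected = matches_keyword(expression, prefix + "test_on_one_row")
selected = matches_keyword(expression, prefix + "test_on_six_rows")  # hypothetical name
```

Here deselected is False and selected is True, which mirrors the single passing test in the output above.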

Let's run some tests!
