Filtering arrays

Introduction to NumPy

Izzy Weber

Core Curriculum Manager, DataCamp

Two ways to filter

  1. Masks and fancy indexing
  2. np.where()

Boolean masks

import numpy as np
one_to_five = np.arange(1, 6)
one_to_five
array([1, 2, 3, 4, 5])
mask = one_to_five % 2 == 0
mask
array([False,  True, False,  True, False])
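A minimal sketch of what a mask is, assuming only NumPy itself: a Boolean array with the same shape as the array it was built from. The name `even_count` is illustrative and does not appear in the slides.

```python
import numpy as np

one_to_five = np.arange(1, 6)

# The comparison runs element by element, producing one Boolean per element.
mask = one_to_five % 2 == 0

print(mask.dtype)   # bool
print(mask.shape)   # (5,), the same shape as one_to_five

# True counts as 1, so summing a mask counts the matching elements.
even_count = mask.sum()
print(even_count)   # 2
```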

Filtering with fancy indexing

one_to_five = np.arange(1, 6)
mask = one_to_five % 2 == 0
one_to_five[mask]
array([2, 4])
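The mask does not need its own variable. The sketch below writes it inline and also illustrates that indexing with a Boolean mask returns a copy rather than a view; the name `evens` is an assumption for illustration.

```python
import numpy as np

one_to_five = np.arange(1, 6)

# The mask can be written directly inside the square brackets.
evens = one_to_five[one_to_five % 2 == 0]
print(evens)          # [2 4]

# Boolean indexing returns a copy, so editing the result leaves the source alone.
evens[0] = 100
print(one_to_five)    # [1 2 3 4 5]
```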

2D fancy indexing

classroom_ids_and_sizes = np.array([[1, 22], [2, 21], [3, 27], [4, 26]])
classroom_ids_and_sizes
array([[ 1, 22],
       [ 2, 21],
       [ 3, 27],
       [ 4, 26]])
classroom_ids_and_sizes[:, 1] % 2 == 0
array([ True, False, False,  True])

classroom_ids_and_sizes[:, 0][classroom_ids_and_sizes[:, 1] % 2 == 0]
array([1, 4])
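The same mask can also select whole rows. This is a sketch under the assumption that a row-level filter is wanted; `even_size`, `even_rows`, and `even_ids` are illustrative names, not from the slides.

```python
import numpy as np

classroom_ids_and_sizes = np.array([[1, 22], [2, 21], [3, 27], [4, 26]])

# One Boolean per row, built from the size column.
even_size = classroom_ids_and_sizes[:, 1] % 2 == 0

# Indexing the first axis with the mask keeps whole rows.
even_rows = classroom_ids_and_sizes[even_size]
print(even_rows)
# [[ 1 22]
#  [ 4 26]]

# Adding a column index narrows the result to the IDs only.
even_ids = classroom_ids_and_sizes[even_size, 0]
print(even_ids)       # [1 4]
```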

Fancy indexing vs. np.where()

Fancy indexing

  • Returns an array of the matching elements

np.where()

  • Returns a tuple of index arrays, one per dimension
  • Can create a new array by choosing values based on whether elements do or don't meet a condition
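The three behaviours listed above can be compared side by side. This is a minimal sketch; `sizes`, `is_even`, and `labels` are hypothetical names chosen for the illustration.

```python
import numpy as np

# Hypothetical standalone copy of the class-size column.
sizes = np.array([22, 21, 27, 26])
is_even = sizes % 2 == 0

# Fancy indexing: the matching elements themselves.
print(sizes[is_even])            # [22 26]

# np.where() with one argument: a tuple holding the matching indices.
print(np.where(is_even)[0])      # [0 3]

# np.where() with three arguments: a new array of the same shape, taking the
# second argument where the condition holds and the third elsewhere.
labels = np.where(is_even, "even", "odd")
print(labels)                    # ['even' 'odd' 'odd' 'even']
```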

Filtering with np.where()

classroom_ids_and_sizes
array([[ 1, 22],
       [ 2, 21],
       [ 3, 27],
       [ 4, 26]])
np.where(classroom_ids_and_sizes[:, 1] % 2 == 0)
(array([0, 3]),)
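The trailing comma in the output shows that the result is a one-element tuple. A short sketch of unpacking it and feeding the indices back into the array, assuming the same classroom data as above:

```python
import numpy as np

classroom_ids_and_sizes = np.array([[1, 22], [2, 21], [3, 27], [4, 26]])

# For a 1D condition the tuple has a single entry; [0] extracts the index array.
row_ind = np.where(classroom_ids_and_sizes[:, 1] % 2 == 0)[0]
print(row_ind)                               # [0 3]

# Integer indices can be passed back in to retrieve the matching IDs.
print(classroom_ids_and_sizes[row_ind, 0])   # [1 4]
```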

np.where() element retrieval

sudoku_game
array([[0, 0, 4, 3, 0, 0, 2, 0, 9],
       [0, 0, 5, 0, 0, 9, 0, 0, 1],
       [0, 7, 0, 0, 6, 0, 0, 4, 3],
       [0, 0, 6, 0, 0, 2, 0, 8, 7],
       [1, 9, 0, 0, 0, 7, 4, 0, 0],
       [0, 5, 0, 0, 8, 3, 0, 0, 0],
       [6, 0, 0, 0, 0, 0, 1, 0, 5],
       [0, 0, 3, 5, 0, 8, 6, 9, 0],
       [0, 4, 2, 9, 1, 0, 3, 0, 0]])

A tuple of indices

row_ind, column_ind = np.where(sudoku_game == 0)
row_ind, column_ind
(array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4,
        4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8,
        8, 8]),
 array([0, 1, 4, 5, 7, 0, 1, 3, 4, 6, 7, 0, 2, 3, 5, 6, 0, 1, 3, 4, 6, 2,
        3, 4, 7, 8, 0, 2, 3, 6, 7, 8, 1, 2, 3, 4, 5, 7, 0, 1, 4, 8, 0, 5,
        7, 8]))
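The two arrays line up position by position: the first empty cell is at (row_ind[0], column_ind[0]), and so on. The sketch below uses a hypothetical 3x3 `grid` in place of `sudoku_game` to keep the output short; `empty_cells` is likewise an illustrative name.

```python
import numpy as np

# Hypothetical 3x3 grid standing in for sudoku_game; 0 marks an empty cell.
grid = np.array([[0, 4, 0],
                 [5, 0, 9],
                 [7, 0, 0]])

row_ind, column_ind = np.where(grid == 0)

# Pairing the two arrays gives one (row, column) coordinate per empty cell.
empty_cells = [(int(r), int(c)) for r, c in zip(row_ind, column_ind)]
print(empty_cells)    # [(0, 0), (0, 2), (1, 1), (2, 1), (2, 2)]

# Indexing with both arrays retrieves exactly those cells.
print(grid[row_ind, column_ind])   # [0 0 0 0 0]
```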

Find and replace

np.where(sudoku_game == 0, "", sudoku_game)
array([['', '', '4', '3', '', '', '2', '', '9'],
       ['', '', '5', '', '', '9', '', '', '1'],
       ['', '7', '', '', '6', '', '', '4', '3'],
       ['', '', '6', '', '', '2', '', '8', '7'],
       ['1', '9', '', '', '', '7', '4', '', ''],
       ['', '5', '', '', '8', '3', '', '', ''],
       ['6', '', '', '', '', '', '1', '', '5'],
       ['', '', '3', '5', '', '8', '6', '9', ''],
       ['', '4', '2', '9', '1', '', '3', '', '']])
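The replacement value does not have to be a string. A sketch of the same find-and-replace pattern with a numeric placeholder, again on a hypothetical 3x3 `grid` rather than the full `sudoku_game`; the placeholder -1 is an arbitrary choice for illustration.

```python
import numpy as np

# Hypothetical 3x3 grid standing in for sudoku_game; 0 marks an empty cell.
grid = np.array([[0, 4, 0],
                 [5, 0, 9],
                 [7, 0, 0]])

# Where the condition is True take -1, otherwise keep the original value.
filled = np.where(grid == 0, -1, grid)
print(filled)
# [[-1  4 -1]
#  [ 5 -1  9]
#  [ 7 -1 -1]]

# np.where() builds a new array; the input is unchanged.
print(grid[0, 0])     # 0
```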

Let's practice!
