Filtering and sorting

Data Analysis in Google Sheets

James Chapman

Curriculum Manager, DataCamp

Exploring data

 

 

The exploring data step highlighted alongside the other stages of the data analysis process.

Data Analysis in Google Sheets

Finding unique values

  • Categorical data: can only be one of a finite number of values
  • E.g., countries and blood groups
  • UNIQUE() → find the number of unique values

A column containing country data.

Data Analysis in Google Sheets

Finding unique values

 

Syntax:

  • UNIQUE(range)
    • range - the cells to search across

Starting the formula with the UNIQUE function name.

Data Analysis in Google Sheets

Finding unique values

 

Syntax:

  • UNIQUE(range)
    • range - the cells to search across

Starting the cell range with A2.

Data Analysis in Google Sheets

Finding unique values

 

Syntax:

  • UNIQUE(range)
    • range - the cells to search across

The cell range, A2-colon-A has been passed to the UNIQUE function.

Data Analysis in Google Sheets

Finding unique values

 

Syntax:

  • UNIQUE(range)
    • range - the cells to search across

The UNIQUE function returned 17 unique countries.

Data Analysis in Google Sheets

Filtering

  • Extract subsets of the dataset for more detailed exploration

A spreadsheet containing data from various kickstarter games.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

Starting the formula with the FILTER function.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

The range to filter, A2-colon-A, has been passed to the FILTER function.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

The condition starts with the range containing the Backer values, B2-colon-B.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

The greater than operator is used to define the criteria.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

The criteria is completed by specifying the value, 1000.

Data Analysis in Google Sheets

FILTER()

Syntax:

FILTER(range, condition1, [condition2, ...])
  • range: range to be filtered
  • condition1: condition to filter on

The function returns the games with more than 1000 backers.

Data Analysis in Google Sheets

Sorting

 

  • Identify largest and smallest values
  • Useful for identifying potential outliers

Sorting icon.

Data Analysis in Google Sheets

SORT()

Syntax:

SORT(range, sort_column, is_ascending, ...)
  • range: the cells to sort
  • sort_column: the column to base the sorting on
  • is_ascending (TRUE/FALSE): the sorting order
Data Analysis in Google Sheets

Sorting one column

Syntax:

SORT(range, sort_column, is_ascending, ...)

The Backers column is sorted by passing the range B2-colon-B to the SORT function.

Data Analysis in Google Sheets

Sorting one column

Syntax:

SORT(range, sort_column, is_ascending, ...)

The SORT function sorted the Backers column in ascending order by default.

Data Analysis in Google Sheets

Sorting multiple columns

Syntax:

SORT(range, sort_column, is_ascending, ...)

The kickstarter dataset with two new columns, Name by Backers and Backers Sorted, which will contains the name and Backers data sorted by the Backers values.

Data Analysis in Google Sheets

Sorting multiple columns

Syntax:

SORT(range, sort_column, is_ascending, ...)

The first argument passed to SORT is the range containing both the name and Backers column, A2-colon-B.

Data Analysis in Google Sheets

Sorting multiple columns

Syntax:

SORT(range, sort_column, is_ascending, ...)

The second argument to SORT, the column to use for sorting, is specified as two, the second column in the specified range.

Data Analysis in Google Sheets

Sorting multiple columns

Syntax:

SORT(range, sort_column, is_ascending, ...)

The third argument passed to SORT is FALSE to sort in descending order.

Data Analysis in Google Sheets

Sorting multiple columns

Syntax:

SORT(range, sort_column, is_ascending, ...)

The function returns the names and Backers data sorted by the Backers values in descending order.

Data Analysis in Google Sheets

Let's practice!

Data Analysis in Google Sheets

Preparing Video For Download...