Data preparation

Introduction to Alteryx

Iason Prassides

Content Developer, DataCamp

What is data preparation?

  • Important step in data analysis
  • Involves cleaning, transforming, and organizing raw data
  • Clean data results in more accurate analysis

Image showing the cleaning, transforming, and organizing of data

What is data preparation?

  • Remove missing values, typos, and duplicate entries
  • Ensure the data is relevant to the analysis
  • Ensure columns use the correct data types and helpful names
  • Preparing data early supports effective decision-making

Data cleaning image
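
The cleaning steps above can also be sketched outside Alteryx Designer. The following is a minimal Python illustration, not how Alteryx itself works; the records, the field names, and the `clean` helper are all hypothetical.

```python
# Hypothetical raw records; the field names are illustrative only.
raw_rows = [
    {"title": "Dune", "author": "Frank Herbert", "genre": "Sci-Fi"},
    {"title": "Dune", "author": "Frank Herbert", "genre": "Sci-Fi"},  # duplicate entry
    {"title": "Emma", "author": None, "genre": "Romance"},            # missing value
    {"title": "Ulysses", "author": "James Joyce", "genre": "Fiction"},
]


def clean(rows):
    """Drop rows with a missing value, then drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None or value == "" for value in row.values()):
            continue  # skip rows that have a missing value
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip rows already kept once
        seen.add(key)
        cleaned.append(row)
    return cleaned


clean_rows = clean(raw_rows)
print(len(clean_rows))  # 2
```

Only the first "Dune" row and the "Ulysses" row remain after cleaning.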

Cartoon image of a school library.png

DC High School library

  • Classify new books and journals by author and genre
  • Tag and catalog items for easy organization and identification
  • Sort and display books in the appropriate sections

Image of a librarian organizing books in a library

The tools

Horizontal tool palette.png

The tools

Preparation toolset available

  • Select tool
  • Sort tool
  • Sample tool

The three preparation tools.png

Data types in Alteryx Designer

  • Important to select the correct data type for columns
    • Ensures accurate analysis and calculations
    • Allows for improved profiling and efficiency

Image of three data types

Data types in Alteryx Designer

  • Main data type categories in Alteryx Designer:
    • Boolean, Numeric, String, DateTime, Spatial

All five data types used in Alteryx

Data types in Alteryx Designer

  • DC High School dataset
    • Contains only text and numbers, so the focus is on the Numeric and String types

Focus on Numeric and String data types.png

Data types in Alteryx Designer

  • Numeric data types:
    • Byte, Integer (Int16, Int32, Int64), Fixed Decimal, Float, Double
  • Store whole and decimal numbers accurately
  • Byte holds whole numbers from 0 to 255
  • Example:
    • Test results out of 100 (whole numbers) fit in the Byte type

Image showing the numeric Data types.png
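
To make the Byte range concrete, here is a minimal Python sketch that checks whether a column of values would fit in a type limited to whole numbers from 0 to 255. The `fits_in_byte` helper and the sample scores are hypothetical and are not part of Alteryx Designer.

```python
BYTE_MIN, BYTE_MAX = 0, 255  # range of whole numbers a Byte can hold


def fits_in_byte(values):
    """Return True if every value is a whole number between 0 and 255."""
    return all(
        isinstance(v, int) and not isinstance(v, bool) and BYTE_MIN <= v <= BYTE_MAX
        for v in values
    )


test_scores = [87, 100, 0, 64]    # hypothetical test results out of 100
print(fits_in_byte(test_scores))  # True
print(fits_in_byte([87, 256]))    # False: 256 is above the Byte maximum
print(fits_in_byte([87, 92.5]))   # False: 92.5 is not a whole number
```

If scores could include decimals such as 92.5, a decimal-capable type like Fixed Decimal or Double would be the better fit.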

Data types in Alteryx Designer

  • String types represent sequences of text
  • Alteryx Designer string types:
    • String, WString, V_String, V_WString
  • V_String stores text of variable length
    • From short to very large values
  • Strings can contain letters, numbers, symbols, and spaces

Image showing the string data type.png

The tools

Select tool

  • Used to include or exclude columns
  • Apply the appropriate column data types
  • Change column names
  • Add descriptions for each column

Select tool image.png
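
As a rough analogy for what the Select tool does, the Python sketch below keeps a subset of columns, renames one, and converts a data type. The `select` helper, the column names, and the rows are hypothetical; in Alteryx Designer this is configured in the tool rather than written as code.

```python
# Hypothetical input rows; every value starts out as text.
rows = [
    {"book_id": "101", "ttl": "Dune", "shelf_note": "x"},
    {"book_id": "102", "ttl": "Emma", "shelf_note": "y"},
]


def select(rows, keep, rename=None, types=None):
    """Keep the listed columns, optionally renaming and converting them."""
    rename = rename or {}
    types = types or {}
    result = []
    for row in rows:
        new_row = {}
        for column in keep:
            value = row[column]
            if column in types:
                value = types[column](value)  # apply the requested data type
            new_row[rename.get(column, column)] = value
        result.append(new_row)
    return result


selected = select(
    rows,
    keep=["book_id", "ttl"],   # shelf_note is excluded
    rename={"ttl": "title"},   # give the column a more helpful name
    types={"book_id": int},    # store the ID as a number
)
print(selected)
# [{'book_id': 101, 'title': 'Dune'}, {'book_id': 102, 'title': 'Emma'}]
```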

The tools

Sort tool

  • Sort the dataset by one or more of its columns
  • Choose ascending or descending order

sort tool image.png
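
The same idea, sorting by a chosen column in ascending or descending order, can be sketched in Python. The book records below are hypothetical, and this illustrates the concept rather than the Sort tool's implementation.

```python
# Hypothetical records to sort.
books = [
    {"title": "Emma", "year": 1815},
    {"title": "Dune", "year": 1965},
    {"title": "Ulysses", "year": 1922},
]

# Ascending order by title.
by_title = sorted(books, key=lambda book: book["title"])

# Descending order by year.
by_year_desc = sorted(books, key=lambda book: book["year"], reverse=True)

print([book["title"] for book in by_title])     # ['Dune', 'Emma', 'Ulysses']
print([book["year"] for book in by_year_desc])  # [1965, 1922, 1815]
```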

The tools

Sample tool

  • Create a sample of the dataset
  • Control how the sample is created and how large it is

sample tool image.png
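
Two common sampling methods, taking the first N rows and taking one row out of every N, can be sketched in Python as follows. The helper names and the choice of which row to take from each group are assumptions made for illustration; they are not a description of the Sample tool's internals.

```python
rows = list(range(1, 11))  # hypothetical dataset: row numbers 1 to 10


def first_n(rows, n):
    """Return the first n rows."""
    return rows[:n]


def one_of_every_n(rows, n):
    """Return every n-th row (here, the last row of each group of n)."""
    return rows[n - 1::n]


print(first_n(rows, 3))         # [1, 2, 3]
print(one_of_every_n(rows, 3))  # [3, 6, 9]
```

The method determines which rows are kept, and N determines how large the sample is.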

Let's practice!
