Clustering

Statistical Techniques in Tableau

Maarten Van den Broeck

Content Developer at DataCamp

Supervised vs. unsupervised machine learning

Supervised learning

  • Applies a known relationship between variables to new, unseen data
  • E.g. regression, exponential smoothing

Unsupervised learning

  • Looks for similar data points and detects patterns
  • E.g. clustering

k-means clustering

Visual representation of the k-means clustering algorithm:

  • Two random centers are selected.
  • Each point is assigned to its closest center.
  • Each center is moved to the mean of the points assigned to it.
  • The assignment and update steps repeat iteratively.
  • The algorithm stops when the centers stop moving.
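
The steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not how Tableau implements clustering: the function name `k_means`, its parameters, and the toy data are all assumptions made for the example.

```python
import numpy as np

def k_means(points, k=2, max_iter=100, seed=0):
    """Minimal k-means sketch (hypothetical helper): returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: assign each point to its closest center.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each center to the mean of its assigned points
        # (a center with no assigned points stays where it is).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Step 4: stop when the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Hypothetical toy data: two well-separated groups of 2-D points.
points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
                   [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])
centers, labels = k_means(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster gets label 0 depends on the random starting centers.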

Assess clustering quality

Between-group sum of squares

Visual representation of k-means clustering. Between-group sum of squares is the sum of the squared distances between each cluster center and the mean of the whole dataset, with each center weighted by the number of points in its cluster.

  • The higher, the better

Within-group sum of squares

Visual representation of k-means clustering. Within-group sum of squares is the sum of the squared distances between each cluster center and the points assigned to that cluster.

  • The lower, the better
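
Both metrics can be computed by hand to see how they behave. The sketch below assumes the size-weighted definition of the between-group sum of squares. The function name `sums_of_squares` and the four toy points are hypothetical and chosen so the arithmetic is easy to check.

```python
import numpy as np

def sums_of_squares(points, labels):
    """Return (between_ss, within_ss) for a given cluster assignment."""
    overall_mean = points.mean(axis=0)
    between_ss = 0.0
    within_ss = 0.0
    for j in np.unique(labels):
        members = points[labels == j]
        center = members.mean(axis=0)
        # Between: squared distance from the center to the overall mean,
        # weighted by the number of points in the cluster (assumed definition).
        between_ss += len(members) * np.sum((center - overall_mean) ** 2)
        # Within: squared distances from each member to its own center.
        within_ss += np.sum((members - center) ** 2)
    return float(between_ss), float(within_ss)

# Hypothetical toy data: two points on the left, two on the right.
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

good = np.array([0, 0, 1, 1])   # left pair vs. right pair
poor = np.array([0, 1, 0, 1])   # bottom pair vs. top pair

good_between, good_within = sums_of_squares(points, good)  # 100.0, 4.0
poor_between, poor_within = sums_of_squares(points, poor)  # 4.0, 100.0
```

The sensible grouping has a high between-group and a low within-group sum of squares, and the poor grouping has the reverse. In both cases the two values add up to the same total sum of squares (104), so for a fixed dataset raising one lowers the other.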

Let's practice!
