Information and feature importance

Dimensionality Reduction in R

Matt Pickard

Owner, Pickard Predictives, LLC

Quote on information gain

1 Provost, Foster; Fawcett, Tom (2013-07-27). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media. Kindle Edition.

Feature importance

Feature importance: a measure of how much information a feature contributes when building a model

Predictor target model illustration

Many ways to measure feature importance

  • Correlation (with the target variable), as sketched after this list
  • Standardized regression coefficients
  • Information gain
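
For example, the correlation measure can be computed directly with base R's cor(). A minimal sketch using a small simulated data frame (df, x1, x2, and y are names introduced here for illustration, not the course data):

# Rank numeric features by the absolute value of their correlation
# with the target; higher means more important.
set.seed(1)
df <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
df$y <- df$x1 + rnorm(100)           # target driven mostly by x1

predictors <- setdiff(names(df), "y")
importance <- sapply(predictors, function(p) abs(cor(df[[p]], df$y)))
sort(importance, decreasing = TRUE)  # x1 should rank well above x2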

Decision tree example

A set of observations of loan defaults with characteristics of shape, color, outline, and texture


Decision tree and information gain

Information gain: the amount of information we gain about one variable by observing another variable

Information gain equation (a parent set being split into children by some feature):

IG(parent) = entropy(parent) - [p(left) * entropy(left) + p(right) * entropy(right)]

where p(left) and p(right) are the proportions of the parent's observations that fall into each child


Entropy

  • A measure of disorder
  • As purity goes up, entropy goes down
  • Entropy values range from 0 (perfect purity) to 1 (maximum disorder, for a two-class target)

Entropy graph
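
The curve in the graph can be reproduced in a few lines of base R; a minimal sketch of the two-class entropy curve:

# Two-class entropy as a function of the proportion of one class.
# Entropy peaks at 1 when the split is 50/50 and falls to 0 at perfect purity.
p <- seq(0.001, 0.999, by = 0.001)
ent <- -p * log2(p) - (1 - p) * log2(1 - p)
plot(p, ent, type = "l", xlab = "Proportion of one class", ylab = "Entropy")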


Entropy: root node

Entropy equation (summing over the classes in the node, here yes/no):

entropy = -Σ p_i * log2(p_i)

p_yes <- 7/16   # proportion of "yes" observations in the root node
p_no <- 9/16    # proportion of "no" observations
entropy_root <- -(p_yes * log2(p_yes)) - (p_no * log2(p_no))
entropy_root
0.989

Image of observations in root node
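
Because the same arithmetic repeats for every node, it can help to wrap it in a function. A sketch (entropy2 is a name introduced here, not part of the course code):

# Two-class entropy from the count of "yes" observations and the node size.
entropy2 <- function(n_yes, n_total) {
  p <- c(n_yes, n_total - n_yes) / n_total
  p <- p[p > 0]        # treat 0 * log2(0) as 0
  -sum(p * log2(p))
}
entropy2(7, 16)        # root node: 0.989, matching entropy_root above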

Entropy: children nodes

p_left_yes <- 2/9   # proportion of "yes" observations in the left child
p_left_no <- 7/9    # proportion of "no" observations
entropy_left <- -(p_left_yes * log2(p_left_yes)) - (p_left_no * log2(p_left_no))
entropy_left
0.764

Decision tree split to make first level from root

Entropy: children nodes

p_right_yes <- 5/7   # proportion of "yes" observations in the right child
p_right_no <- 2/7    # proportion of "no" observations
entropy_right <- -(p_right_yes * log2(p_right_yes)) - (p_right_no * log2(p_right_no))
entropy_right
0.863

Decision tree split to make first level from root


Information gain: root to children

p_left <- 9/16    # proportion of observations sent to the left child
p_right <- 7/16   # proportion sent to the right child
info_gain <- entropy_root - (p_left * entropy_left + p_right * entropy_right)
info_gain
0.181

Decision tree split to make first level from root
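
The whole calculation can also be wrapped into a single function. A sketch that reuses the hypothetical entropy2() helper from earlier (info_gain2 is likewise a name introduced here):

# Information gain of a two-way split, from "yes"/total counts per node.
info_gain2 <- function(yes_parent, n_parent, yes_left, n_left, yes_right, n_right) {
  weighted_children <- (n_left / n_parent) * entropy2(yes_left, n_left) +
    (n_right / n_parent) * entropy2(yes_right, n_right)
  entropy2(yes_parent, n_parent) - weighted_children
}
info_gain2(7, 16, 2, 9, 5, 7)   # 0.181, matching info_gain above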


Compare information gain across features

Feature   Information gain
shape     0.181
texture   0.180
outline   0.106
color     0.106

Decision tree with question mark at split


Let's practice!
