Special topics in Machine Learning

Data Science for Business

Ramnath Vaidyanathan

VP of Product Research, DataCamp

Time series forecasting

  • Time is a feature
  • Accounts for weekly, monthly, or yearly trends
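To make "time is a feature" concrete, here is a minimal sketch that turns a calendar date into numeric columns a model could consume. The function name `time_features` and the choice of three features are assumptions for illustration, not something the course prescribes.

```python
from datetime import date


def time_features(day: date) -> dict:
    """Turn a calendar date into numeric features (hypothetical helper)."""
    return {
        "day_of_week": day.weekday(),            # 0 = Monday ... 6 = Sunday
        "month": day.month,                      # 1-12
        "day_of_year": day.timetuple().tm_yday,  # 1-366
    }


# 5 July 2024 fell on a Friday
features = time_features(date(2024, 7, 5))
print(features)
```

Each of these columns gives a model a handle on one kind of trend: day of week for weekly patterns, month and day of year for monthly and yearly ones.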

Seasonality

  • Weekly: Lower television viewership on Fridays
  • Monthly: Higher spending at end of pay periods
  • Yearly: Lower ice cream sales in the winter
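One simple way to see a weekly seasonal pattern is to average a measurement by day of the week. The sketch below does that on made-up viewership numbers; the data, the values 100 and 60, and the helper `mean_by_weekday` are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def mean_by_weekday(observations):
    """Average the observed values for each weekday (0 = Monday ... 6 = Sunday)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in observations:
        totals[day.weekday()] += value
        counts[day.weekday()] += 1
    return {weekday: totals[weekday] / counts[weekday] for weekday in totals}


# Hypothetical data: two weeks of daily viewership, lower on Fridays
start = date(2024, 1, 1)  # a Monday
observations = []
for offset in range(14):
    day = start + timedelta(days=offset)
    viewers = 60.0 if day.weekday() == 4 else 100.0
    observations.append((day, viewers))

averages = mean_by_weekday(observations)
lowest_day = min(averages, key=averages.get)
print(averages, lowest_day)
```

The same grouping idea applies to monthly or yearly patterns by swapping `weekday()` for the month or day of year.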

Natural Language Processing

  • Dataset is text
    • Customer reviews
    • Tweets
    • Medical records
    • Email subjects
  • Possible uses
    • Classifying sentiment
    • Clustering medical records

Word counts

Sentence                                   Texans  Giants  football  great
The Texans are a great football team.      1       0       1         1
The Giants are a great football team.      0       1       1         1
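The word-count table above can be reproduced with a few lines of standard-library Python. This is a minimal sketch: the fixed `VOCABULARY` list and the helper `count_vector` are assumptions made for illustration, and real pipelines usually build the vocabulary from the data.

```python
import string
from collections import Counter

# Vocabulary taken from the column headers of the table
VOCABULARY = ["texans", "giants", "football", "great"]


def count_vector(sentence, vocabulary=VOCABULARY):
    """Count how often each vocabulary word appears in the sentence."""
    # Lowercase and strip punctuation so "team." and "Team" match "team"
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    counts = Counter(cleaned.split())
    return [counts[word] for word in vocabulary]


texans = count_vector("The Texans are a great football team.")
giants = count_vector("The Giants are a great football team.")
print(texans)
print(giants)
```

Each sentence becomes one row of numbers, which is the form most machine learning models expect.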

Problems with word counts: negation

Sentence                                   Texans  Giants  football  great  not
The Giants are a great football team.      0       1       1         1      0
The Giants are not a great football team.  0       1       1         1      1
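The negation problem shows up directly in the count vectors: the two sentences have opposite meanings, yet they differ in a single column. The sketch below illustrates this with a hypothetical `count_vector` helper and the vocabulary from the table.

```python
import string
from collections import Counter


def count_vector(sentence, vocabulary):
    """Count how often each vocabulary word appears in the sentence."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    counts = Counter(cleaned.split())
    return [counts[word] for word in vocabulary]


positive = "The Giants are a great football team."
negative = "The Giants are not a great football team."

# Without a "not" column, the two sentences look identical
base_vocab = ["texans", "giants", "football", "great"]
same_without_not = count_vector(positive, base_vocab) == count_vector(negative, base_vocab)

# With a "not" column, they differ in that one feature only
full_vocab = base_vocab + ["not"]
pos_vec = count_vector(positive, full_vocab)
neg_vec = count_vector(negative, full_vocab)
differing = [w for w, a, b in zip(full_vocab, pos_vec, neg_vec) if a != b]
print(same_without_not, differing)
```

Word counts record that "not" is present but lose which word it modifies, so a model sees both sentences as mostly about a "great" team.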

Word counts and synonyms

  • Word counts don't help us consider synonyms
  • Example: "blue"
    • "sky-blue"
    • "aqua"
    • "cerulean"
  • Want to group as a single feature
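As a rough illustration of grouping synonyms into a single feature, the sketch below maps each shade to "blue" before counting. The hand-written `COLOR_GROUPS` dictionary and the sample word list are assumptions for this example only.

```python
from collections import Counter

# Hypothetical hand-written mapping from each synonym to one shared feature
COLOR_GROUPS = {
    "blue": "blue",
    "sky-blue": "blue",
    "aqua": "blue",
    "cerulean": "blue",
}


def grouped_counts(words, groups=COLOR_GROUPS):
    """Count words after replacing known synonyms with their group name."""
    return Counter(groups.get(word.lower(), word.lower()) for word in words)


counts = grouped_counts(["aqua", "Cerulean", "sky-blue", "red"])
print(counts)
```

A mapping like this has to be maintained by hand for every group of synonyms, which is impractical for a full vocabulary.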

Word embeddings

  • Create features that group similar words
  • Features have a mathematical meaning:
    • king - man + woman = queen
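The arithmetic above can be sketched with toy vectors. The two-dimensional `EMBEDDINGS` below are invented for illustration (one axis loosely "royalty", the other loosely "gender"). Real embeddings are learned from large amounts of text, have many more dimensions, and satisfy the relation only approximately.

```python
import math

# Hypothetical 2-D embeddings: (royalty, gender)
EMBEDDINGS = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "child": (0.2, 0.0),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def nearest(vector, exclude):
    """Return the word whose embedding is most similar to the vector."""
    candidates = [word for word in EMBEDDINGS if word not in exclude]
    return max(candidates, key=lambda word: cosine(EMBEDDINGS[word], vector))


king, man, woman = EMBEDDINGS["king"], EMBEDDINGS["man"], EMBEDDINGS["woman"]
# king - man + woman, computed component by component
target = tuple(k - m + w for k, m, w in zip(king, man, woman))
result = nearest(target, exclude={"king", "man", "woman"})
print(target, result)
```

The input words are excluded from the search, which is the usual convention for this kind of analogy query.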

Review

  • Time series forecasting
    • Time is a feature
    • Seasonality
  • Natural Language Processing (NLP)
    • Text as input data
    • Word counts
    • Word embeddings

Let's practice!
