Profiling, versioning, and feature stores

MLOps Deployment and Life Cycling

Nemanja Radojkovic

Senior Machine Learning Engineer

Data profiling

Automated data analysis and creation of high-level summaries (a.k.a. data profiles, expectations), used for validating and monitoring data in production
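
To make the definition above concrete, here is a minimal sketch of building a data profile from training data and then using it to validate a record in production. It uses only the Python standard library. The function names (`build_profile`, `validate`), the column names, and the choice of summary statistics are assumptions made for illustration, not the API of any profiling tool.

```python
from typing import Any


def is_number(value: Any) -> bool:
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def build_profile(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Summarise each column of the reference data into a small profile."""
    profile: dict[str, dict[str, Any]] = {}
    columns = sorted({name for row in rows for name in row})
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        summary: dict[str, Any] = {"null_fraction": 1 - len(present) / len(values)}
        if present and all(is_number(v) for v in present):
            summary.update(kind="numeric", min=min(present), max=max(present))
        else:
            summary.update(kind="categorical", allowed=set(present))
        profile[name] = summary
    return profile


def validate(record: dict[str, Any], profile: dict[str, dict[str, Any]]) -> list[str]:
    """Return a list of human-readable violations (empty if the record passes)."""
    problems: list[str] = []
    for name, summary in profile.items():
        value = record.get(name)
        if value is None:
            # Only complain if the reference data never had missing values.
            if summary["null_fraction"] == 0:
                problems.append(f"{name}: missing value")
            continue
        if summary["kind"] == "numeric":
            if not is_number(value):
                problems.append(f"{name}: expected a number")
            elif not summary["min"] <= value <= summary["max"]:
                problems.append(
                    f"{name}: {value} outside [{summary['min']}, {summary['max']}]"
                )
        elif value not in summary["allowed"]:
            problems.append(f"{name}: unexpected value {value!r}")
    return problems


# Hypothetical training data, used only to illustrate the idea.
training_rows = [
    {"age": 34, "plan": "basic"},
    {"age": 51, "plan": "premium"},
    {"age": 27, "plan": "basic"},
]
profile = build_profile(training_rows)

good_record_problems = validate({"age": 40, "plan": "basic"}, profile)
bad_record_problems = validate({"age": -3, "plan": "gold"}, profile)
```

In this sketch the first record passes with no violations, while the second is flagged on both columns. A production setup would more likely rely on a dedicated tool than on hand-written checks like these.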

Purpose of references 1

Purpose of references 2

Purpose of references 3

Risk of NOT using data profiles

  • Clients complaining, although they submitted erroneous inputs to the model
  • No way to identify that data has drifted and our model is no longer valid
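
The second risk above, undetected drift, can be illustrated with a very simple check: compare the mean of a production batch with the mean stored in the reference profile, measured in units of the reference standard deviation. This is a minimal sketch under stated assumptions. The function names, the sample values, and the threshold of 3 are all hypothetical, and real monitoring setups often use more robust statistical tests.

```python
import math
import statistics


def drift_score(reference: list[float], production: list[float]) -> float:
    """Shift of the production mean, in reference standard deviations."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)  # sample standard deviation
    shift = abs(statistics.mean(production) - ref_mean)
    if ref_std == 0:
        # A constant reference column: any shift at all counts as drift.
        return 0.0 if shift == 0 else math.inf
    return shift / ref_std


def has_drifted(reference: list[float], production: list[float],
                threshold: float = 3.0) -> bool:
    """Flag drift when the score exceeds the (assumed) threshold."""
    return drift_score(reference, production) > threshold


# Hypothetical feature values, for illustration only.
reference_values = [10, 12, 11, 13, 9, 11, 10, 12]
stable_batch = [11, 10, 12, 11]
shifted_batch = [18, 19, 17, 20]

stable_flag = has_drifted(reference_values, stable_batch)
shifted_flag = has_drifted(reference_values, shifted_batch)
```

Here the stable batch is not flagged and the shifted batch is. Without a stored reference to compare against, neither answer would be available.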

checklist profile

Great Expectations

checklist versioning

versioning arrow

ensure reproducibility

not a copy

data in place

just a pointer
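
One way to read the idea that a data version is a pointer and not a copy: the data stays in place, and only a small record identifying its exact content is stored alongside the code. The following is a minimal sketch of that idea using a content hash. It is not how any particular tool is implemented, and the function names, file names, and pointer format are assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def make_pointer(data_path: Path) -> dict:
    """Build a small record that identifies the file's exact content."""
    digest = hashlib.sha256()
    with data_path.open("rb") as handle:
        # Read in 1 MiB chunks so large datasets need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "path": data_path.name,
        "sha256": digest.hexdigest(),
        "size": data_path.stat().st_size,
    }


def matches_pointer(data_path: Path, pointer: dict) -> bool:
    """Check that the data in place is still the version the pointer names."""
    return make_pointer(data_path)["sha256"] == pointer["sha256"]


with tempfile.TemporaryDirectory() as workdir:
    dataset = Path(workdir) / "train.csv"  # hypothetical dataset
    dataset.write_text("age,plan\n34,basic\n")

    # The pointer is tiny, so it can be committed next to the code.
    pointer = make_pointer(dataset)
    pointer_file = dataset.parent / (dataset.name + ".pointer.json")
    pointer_file.write_text(json.dumps(pointer))

    unchanged = matches_pointer(dataset, pointer)

    # Any later change to the data no longer matches the recorded version.
    dataset.write_text("age,plan\n34,premium\n")
    change_detected = not matches_pointer(dataset, pointer)
```

Because the pointer changes whenever the data changes, recording it with each training run is one way to tie a model to the exact train and test sets it was built from.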

version train test

DVC logo

  • Data
  • Version
  • Control

feature store

A feature store is a database

cross project reusability

dual db

high-volume db

single record db

reusability

train serve skew

emails
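
The feature store slides above mention two databases (one for high-volume reads, one for single-record reads) and train-serve skew. A toy in-memory sketch can show how these fit together: features are computed once by a single transformation, then written to both stores, so training and serving read identical values. Everything here is hypothetical, including the class name, the method names, and the email features. A real feature store would use actual databases instead of Python containers.

```python
from typing import Any, Callable

Features = dict[str, Any]


class TinyFeatureStore:
    """Toy stand-in for a feature store with two storage layers."""

    def __init__(self) -> None:
        # Append-only history, read in bulk when building training sets.
        self._offline: list[dict[str, Any]] = []
        # Latest features per entity, read one record at a time when serving.
        self._online: dict[str, Features] = {}

    def ingest(self, entity_id: str, raw: dict[str, Any],
               transform: Callable[[dict[str, Any]], Features]) -> None:
        """Compute features once and write the same values to both layers."""
        features = transform(raw)
        self._offline.append({"entity_id": entity_id, **features})
        self._online[entity_id] = features

    def training_rows(self) -> list[dict[str, Any]]:
        """High-volume read path used for model training."""
        return list(self._offline)

    def online_features(self, entity_id: str) -> Features:
        """Single-record read path used at prediction time."""
        return self._online[entity_id]


def email_features(raw: dict[str, Any]) -> Features:
    """Hypothetical feature logic, defined once and shared by all projects."""
    body = raw["body"]
    return {"n_words": len(body.split()), "has_link": "http" in body}


store = TinyFeatureStore()
store.ingest("msg-1", {"body": "Hello see http://example.com"}, email_features)
store.ingest("msg-2", {"body": "Lunch at noon"}, email_features)

training_set = store.training_rows()
serving_features = store.online_features("msg-1")
```

Since both read paths are fed by the same `email_features` function, the values a model sees in training match those it sees when serving, which is the point of avoiding train-serve skew.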

Let's practice!
