Profiling, versioning, and feature stores

MLOps Deployment and Life Cycling

Nemanja Radojkovic

Senior Machine Learning Engineer

Data profiling

Automated analysis of data and creation of high-level summaries (a.k.a. data profiles or expectations), used to validate and monitor data in production
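
As a rough illustration of the definition above, the sketch below builds a tiny data profile from training rows and uses it to validate an incoming record. It uses only the Python standard library; the function names (`build_profile`, `validate_record`), the column names, and the sample values are all hypothetical, and a real profiling tool would track far more statistics.

```python
from statistics import mean


def build_profile(rows, columns):
    """Summarise each numeric column: observed range, mean, share of missing values."""
    profile = {}
    for col in columns:
        values = [r[col] for r in rows if r.get(col) is not None]
        profile[col] = {
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "null_fraction": 1 - len(values) / len(rows),
        }
    return profile


def validate_record(record, profile):
    """Return a list of human-readable violations for one incoming record."""
    problems = []
    for col, stats in profile.items():
        value = record.get(col)
        if value is None:
            problems.append(f"{col}: missing value")
        elif not stats["min"] <= value <= stats["max"]:
            problems.append(f"{col}: {value} outside [{stats['min']}, {stats['max']}]")
    return problems


# Hypothetical training data; the profile is computed once and stored with the model.
training_rows = [
    {"age": 34, "income": 52000},
    {"age": 51, "income": 61000},
    {"age": 27, "income": None},
]
profile = build_profile(training_rows, ["age", "income"])

# A production input with an implausible age is flagged before it reaches the model.
print(validate_record({"age": 230, "income": 58000}, profile))
```

Using the observed training range as the validity check is a deliberately simple choice; in practice the allowed range would usually be set with some margin or with domain knowledge.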

Purpose of references 1

Purpose of references 2

Purpose of references 3

Risk of NOT using data profiles

  • Clients complaining, although they submitted erroneous inputs to the model
  • No way to identify that data has drifted and our model is no longer valid
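
The second risk above, undetected drift, can be sketched as a comparison between a reference sample from training and a recent production sample. This is a minimal heuristic under stated assumptions, not a recommended statistical test: the function names, the threshold of 3 standard deviations, and the sample values are hypothetical.

```python
from statistics import mean, stdev


def drift_score(reference, current):
    """Shift of the current mean, measured in reference standard deviations."""
    return abs(mean(current) - mean(reference)) / stdev(reference)


def has_drifted(reference, current, threshold=3.0):
    """Flag drift when the mean has moved by more than `threshold` reference stdevs."""
    return drift_score(reference, current) > threshold


# Hypothetical feature values seen at training time vs. in recent production traffic.
training_ages = [25, 31, 38, 44, 52, 29, 36, 47]
recent_ages = [68, 71, 75, 80, 66, 73, 77, 69]

print(has_drifted(training_ages, recent_ages))
```

Without a stored reference profile there is nothing to compare production data against, which is the point the slide makes.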

checklist profile

Great Expectations

checklist versioning

versioning arrow

ensure reproducibility

not a copy

data in place

just a pointer

version train test
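
The idea on the preceding slides (a version is a pointer to data that stays in place, not a copy of it) can be sketched with a content hash. This is an illustration only and not how any particular tool is implemented; `make_pointer`, `matches`, the file names, and the pointer layout are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def make_pointer(data_path):
    """Describe a data file by location and content hash instead of copying it."""
    data_path = Path(data_path)
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    return {"path": str(data_path), "sha256": digest, "size": data_path.stat().st_size}


def matches(pointer):
    """Check that the file in place is still the exact version the pointer refers to."""
    return make_pointer(pointer["path"])["sha256"] == pointer["sha256"]


# Hypothetical training file in a scratch directory.
workdir = Path(tempfile.mkdtemp())
train_file = workdir / "train.csv"
train_file.write_text("age,income\n34,52000\n")

# The small pointer file is what would be committed alongside the training code.
pointer = make_pointer(train_file)
(workdir / "train.csv.pointer.json").write_text(json.dumps(pointer, indent=2))
unchanged = matches(pointer)

# If the data changes in place, the recorded version no longer matches.
train_file.write_text("age,income\n34,99999\n")
changed = not matches(pointer)
print(unchanged, changed)
```

Recording one pointer each for the training and test sets is enough to tell later whether a model is being reproduced on the same data it was originally built with.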

DVC logo

  • Data
  • Version
  • Control

feature store

feature store is a database

cross project reusability

dual db

high-volume db

single record db
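
The dual-database layout named on these slides can be sketched as one write path feeding two stores: a history that is read in bulk for training, and a key-value view that returns a single record at prediction time. The class below is a hypothetical in-memory toy, not the API of any real feature store; all names and values are made up for illustration.

```python
class ToyFeatureStore:
    """Hypothetical in-memory stand-in for a feature store with two storage layers."""

    def __init__(self):
        self._offline = []  # append-only history, read in bulk for training
        self._online = {}   # latest values per entity, for single-record lookups

    def write(self, entity_id, timestamp, features):
        # Assumes writes arrive in chronological order for each entity.
        self._offline.append({"entity_id": entity_id, "timestamp": timestamp, **features})
        self._online[entity_id] = dict(features)

    def training_rows(self, since=0):
        """High-volume read: every historical row from `since` onwards."""
        return [row for row in self._offline if row["timestamp"] >= since]

    def online_features(self, entity_id):
        """Low-latency read: the latest feature values for one entity."""
        return self._online[entity_id]


store = ToyFeatureStore()
store.write("customer_1", 1, {"avg_basket": 41.5})
store.write("customer_2", 1, {"avg_basket": 12.0})
store.write("customer_1", 2, {"avg_basket": 44.0})

print(len(store.training_rows()))
print(store.online_features("customer_1"))
```

Because both stores are filled by the same `write` call, any project reading from either side sees features computed in the same way.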

reusability

train serve skew
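
Train-serve skew arises when a feature is computed one way for training and another way at prediction time. A minimal sketch of the remedy, assuming a single shared feature definition that both paths call; the feature (`debt_to_income`) and the helper names are hypothetical.

```python
def debt_to_income(record):
    """One shared feature definition (hypothetical feature, for illustration)."""
    return round(record["debt"] / record["income"], 4)


def build_training_features(history):
    """Batch path: compute the feature for every historical record."""
    return [debt_to_income(r) for r in history]


def build_serving_features(request):
    """Online path: compute the same feature for a single incoming request."""
    return debt_to_income(request)


raw = {"debt": 12000, "income": 48000}
same_value = build_training_features([raw])[0] == build_serving_features(raw)
print(same_value)
```

If the two paths each carried their own copy of the formula, a change to one copy (for example a different rounding rule) would silently make serving inputs differ from what the model was trained on.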

emails

Let's practice!
