Writing maintainable ML code

Developing Machine Learning Models for Production

Sinan Ozdemir

Data Scientist, Entrepreneur, and Author

Project structuring

  • Organize project files into a logical structure
  • Group related files together
    • Organize data sets and ML models in separate folders
  • Ensure files are properly named and labeled

Sample project structure

[Figure: sample project directory with a README file, a requirements file, and three subfolders: data, models, and notebooks]

  • README.md: explains the purpose of the repository and how to use it
  • requirements.txt: lists all dependencies
  • data: contains data-related files, including raw and processed data
  • models: contains all model-related files, including scripts for creating models
  • notebooks: contains notebooks for data exploration, model training, and model evaluation
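
The layout above can be scaffolded with a few lines of Python. This is a minimal sketch, not a prescribed tool: the function name `create_project_skeleton` and the root folder name used in the usage note are hypothetical, and only the file and folder names come from the slide.

```python
from pathlib import Path


def create_project_skeleton(root):
    """Create the sample layout: README.md, requirements.txt, data/, models/, notebooks/.

    Returns the sorted names of the entries found directly under `root`.
    """
    root = Path(root)

    # Subfolders named on the slide; exist_ok makes the call safe to repeat.
    for folder in ("data", "models", "notebooks"):
        (root / folder).mkdir(parents=True, exist_ok=True)

    # Empty placeholder files; existing files are left untouched.
    for filename in ("README.md", "requirements.txt"):
        (root / filename).touch(exist_ok=True)

    return sorted(entry.name for entry in root.iterdir())
```

Calling `create_project_skeleton("my_ml_project")` (a hypothetical project name) creates the folders and empty placeholder files without overwriting anything that already exists.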

Code versioning

  • Use a version control system such as Git to keep track of changes to the code
  • Allows for rollback of changes if necessary
  • Can help identify the source of bugs and errors
  • Allows for parallel work
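
One way to make rollback and bug tracing practical is to record which commit produced a given model. The sketch below is an illustration under assumptions, not part of the course material: the helper name `current_commit` is hypothetical, and it relies only on the standard `git rev-parse HEAD` command. It returns `None` when Git is not installed or the directory is not a repository.

```python
import shutil
import subprocess


def current_commit(repo_dir="."):
    """Return the hash of the current Git commit, or None if it cannot be read."""
    # No git executable on PATH: nothing to report.
    if shutil.which("git") is None:
        return None

    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )

    # Non-zero exit code: not a repository, or a repository without commits.
    if result.returncode != 0:
        return None

    return result.stdout.strip()
```

Storing this value next to a saved model makes it possible to check out the exact code that trained it.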

Documentation

  • Document code and project structure
  • Explain the purpose of each file and function
  • Describe how to use the code
  • Include instructions on how to deploy the ML model
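
At the function level, the points above usually take the form of a docstring that states the purpose, the inputs and outputs, and a usage example. The function below is a hypothetical illustration (its name and behavior are assumptions chosen for the example, not taken from the course).

```python
def split_indices(n_samples, test_fraction=0.2):
    """Split row indices into a training set and a test set.

    Purpose:
        Keep the last `test_fraction` of the rows for testing and use the
        rest for training. No shuffling is performed.

    Args:
        n_samples: Total number of rows in the data set.
        test_fraction: Share of rows reserved for testing, strictly
            between 0 and 1.

    Returns:
        A tuple `(train_indices, test_indices)` of two lists of integers.

    Raises:
        ValueError: If `test_fraction` is not strictly between 0 and 1.

    Example:
        train, test = split_indices(10, test_fraction=0.2)
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    n_test = int(n_samples * test_fraction)
    split_at = n_samples - n_test
    indices = list(range(n_samples))
    return indices[:split_at], indices[split_at:]
```

The same information, written once at the project level, belongs in README.md together with the deployment instructions.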

Adaptability of code

  • Easier to understand, modify, and update
  • Reduces the time and effort required to make changes to the codebase
  • Adapts more easily to changes in data, code, and requirements
  • Less prone to bugs
  • Easier to integrate new features or technologies as needed
  • Essential for building ML applications that can evolve and adapt over time
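
A common way to get these properties is to move hard-coded values into a single configuration object, so that a change in requirements means editing one place rather than hunting through the training code. The sketch below is a toy illustration under assumptions: `TrainingConfig` and `fit_slope` are hypothetical names, and the model is a one-parameter linear fit chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """All tunable settings in one place (hypothetical example values)."""

    learning_rate: float = 0.1
    n_epochs: int = 100


def fit_slope(xs, ys, config=TrainingConfig()):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(config.n_epochs):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= config.learning_rate * grad
    return w
```

Trying a different setup then requires no change to `fit_slope` itself, only a new `TrainingConfig(learning_rate=..., n_epochs=...)` passed as the `config` argument.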

Let's practice!
