Introduction to GitHub Actions

CI/CD for Machine Learning

Ravi Bhadauria

Machine Learning Engineer

What is GitHub Actions?

  • GitHub Actions (GHA): CI/CD platform to automate pipelines
  • Pipeline: a sequence of steps that represent the flow of work and data

What is GitHub Actions?

An image of a car assembly line with workers stationed at different stages, each performing specific tasks like attaching the engine, installing the wheels, and painting the body.


What is GitHub Actions?

With GitHub Actions, developers define a series of actions or steps for their software project, such as checking out the repository, building the code, running tests, or deploying the application.

1 https://medium.com/empathyco/applying-ci-cd-using-github-actions-for-android-1231e40cc52f

GHA Components: Event

  • Event: a specific activity in a repository that triggers a workflow run
    • Push
    • Pull Request
    • Opening an issue
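
A minimal sketch of how these three events might be declared in the `on:` section of a workflow file. This is a fragment rather than a complete workflow, and restricting the issue trigger to the `opened` activity type is an illustrative choice.

```yaml
# Fragment of a workflow file: events that trigger a run
on:
  push:              # any push to the repository
  pull_request:      # by default: opened, synchronize, reopened
  issues:
    types: [opened]  # only when a new issue is opened
```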

GHA Components: Workflow

  • Workflow: automated process that will run one or more jobs
    • Defined in YAML files
    • Triggered automatically by an event
      • Manual runs are also possible
    • Housed in the .github/workflows directory of the repository
    • Multiple workflows can be defined
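
A sketch of what one such workflow file could look like. The file path, workflow name, and job name are hypothetical; `workflow_dispatch` is the event that makes a manual run available.

```yaml
# Hypothetical file: .github/workflows/hello.yml
name: Hello workflow        # illustrative name shown in the Actions tab

on:
  push:                     # automatic trigger
  workflow_dispatch:        # allows a manual run

jobs:
  say-hello:                # illustrative job name
    runs-on: ubuntu-latest
    steps:
      - run: echo "Hello from GitHub Actions"
```

Adding a second file to the same directory would define a second, separate workflow.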

GHA Components: Steps and Actions

  • Steps: individual units of work
    • Executed in order; each step can depend on the previous one
    • Run on the same machine, so data can be shared
    • Unit of work examples
      • Compiled application, shell script
      • Action: reusable application specific to the GHA platform
        • E.g., check out the repo, comment on a PR
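
A sketch of steps inside a single job, assuming the official `actions/checkout` action at version `v4`. The job name and `result.txt` are illustrative. The fragment omits the `on:` section; it only shows that steps run in order and that a file written by one step is visible to the next, because both run on the same machine.

```yaml
# Fragment: the steps of a single job (job and file names are illustrative)
jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository    # an action
        uses: actions/checkout@v4
      - name: Write a file                 # a shell command
        run: echo "42" > result.txt
      - name: Read the file written by the previous step
        run: cat result.txt
```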

GHA Components: Jobs and Runners

  • Job: set of steps
    • Each job is independent
    • Parallel execution is possible
    • Executed on compute machines called runners
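
A sketch of two independent jobs, with illustrative names `lint` and `test`. Since neither job declares a dependency on the other, they can run in parallel, each on its own runner. The fragment omits the `on:` section.

```yaml
# Fragment: two independent jobs (names are illustrative)
jobs:
  lint:
    runs-on: ubuntu-latest    # each job gets its own runner
    steps:
      - run: echo "Linting"
  test:
    runs-on: ubuntu-latest    # runs in parallel with lint by default
    steps:
      - run: echo "Testing"
```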

A simple GHA workflow

Image of a push event in a repository triggering a workflow in which a single job runs on an Ubuntu Linux runner machine. The job contains two steps: checking out the repo and running Python code.

  • Event: Push
  • Job: runs on an Ubuntu runner, has two steps
    • Action: Checkout Repo
    • Run Python Code
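
The workflow in the image could be written roughly as follows. The file path, the workflow and job names, and the script `hello_world.py` are assumptions for illustration, as is the use of `actions/checkout` at version `v4`.

```yaml
# Hypothetical file: .github/workflows/simple.yml
name: Simple workflow

on: push                      # Event: Push

jobs:
  run-python:                 # illustrative job name
    runs-on: ubuntu-latest    # Ubuntu runner
    steps:
      - name: Checkout repo   # Action
        uses: actions/checkout@v4
      - name: Run Python code
        run: python3 hello_world.py   # assumed script in the repo root
```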

Putting it all together

Image of the components of a typical GitHub Actions workflow, outlining parallel job execution and sequential step execution.


Let's practice!
