Completing the Machine Learning Loop

“One day, all software will learn” — but not today.

(Source: Image by Shutterstock and Daniel Jeffries)

Capability vs. Ability

The Data Science Process

Functional diagram of a machine learning system. Model development is typically an offline process which results in a trained model or inference pipeline to be incorporated into a production analytics system. Over time, data from the production system (typically a data lake) is pulled into the model development process to improve an analytic’s quality and/or performance. (Image by author)

Software Development: Two Life Cycles Diverge in the Woods

The Software Development Life Cycle (SDLC) is a useful construct to show the journey software must continually undergo. (Image by author)
Testing and monitoring required in traditional software systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)

The Two Loops

The Two Loops: A model of what machine learning software development encompasses. The Code Loop is crucial to developing ML software that is stable and efficient, while the Data Loop is essential to improving model quality and keeping the model relevant. Creating ML models requires the Code Loop and Data Loop to interact at various stages, such as model training and monitoring. (Image by author)
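
One way to picture the interaction between the two loops is to treat a trained model as a product of both a code version and a data version. The following is a minimal sketch under that assumption, not a description of any particular tool: the helper names `fingerprint` and `model_version` are hypothetical, and the byte strings stand in for real source files and datasets.

```python
import hashlib


def fingerprint(payload: bytes) -> str:
    """Return a short content hash for a blob of code or data."""
    return hashlib.sha256(payload).hexdigest()[:12]


def model_version(code: bytes, data: bytes) -> str:
    """Derive a version from both inputs (hypothetical helper).

    A change in either the Code Loop or the Data Loop yields a new version.
    """
    combined = fingerprint(code).encode() + b":" + fingerprint(data).encode()
    return fingerprint(combined)


# Illustrative stand-ins for training code and two snapshots of a dataset.
code_v1 = b"def train(rows): ..."
data_v1 = b"label,text\n1,good\n0,bad\n"
data_v2 = data_v1 + b"1,great\n"

same_inputs_match = model_version(code_v1, data_v1) == model_version(code_v1, data_v1)
new_data_changes_version = model_version(code_v1, data_v1) != model_version(code_v1, data_v2)
```

In this sketch the training code never changes, yet adding a single row of data produces a different model version, which is why the Data Loop needs the same kind of tracking as the Code Loop.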

Data: The new source code

Data Bugs

Testing and monitoring required in machine learning systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)
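
As a rough illustration of what a data test might look like, here is a minimal sketch that scans training rows for two common kinds of data bug: empty fields and labels outside the expected set. The function name `find_data_bugs`, the field names, and the sample rows are all assumptions made for this example; real pipelines would typically use a dedicated validation library.

```python
from typing import Dict, List, Optional, Set


def find_data_bugs(
    rows: List[Dict[str, Optional[str]]],
    required_fields: List[str],
    allowed_labels: Set[str],
) -> List[str]:
    """Return a human-readable list of problems found in the rows."""
    problems = []
    for index, row in enumerate(rows):
        # A required field that is absent or empty counts as a bug.
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {index}: missing {field}")
        # A label that is present but not in the allowed set counts as a bug.
        label = row.get("label")
        if label is not None and label not in allowed_labels:
            problems.append(f"row {index}: unexpected label {label!r}")
    return problems


# Hypothetical sample data with one empty field and one misspelled label.
rows = [
    {"text": "good product", "label": "positive"},
    {"text": "", "label": "negative"},
    {"text": "meh", "label": "neutrl"},
]
problems = find_data_bugs(rows, ["text", "label"], {"positive", "negative", "neutral"})
```

Checks like these can run in the same place unit tests run for code, so that a bad batch of data fails the pipeline before it reaches training.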

“Keeping the algorithm fixed and varying the training data was really, in my view, the most efficient way to build the system.” — Andrew Ng

MLOps: DevOps for Machine Learning

MLOps Principles and Key Considerations (updated/modified by author from Summary of MLOps Principles and Best Practices)

Practical MLOps: Pachyderm for data and code+data

Pachyderm architectural and user overview (Image by Pachyderm)
From blog post Pachyderm and the power of GitHub Actions: MLOps meets DevOps (Image by author)

There and Back Again

Written by

Solving real-world problems with Machine Learning and AI | ML Developer Advocate @ pachyderm.io | Computer Science @UniOfOxford | Published @SpringerCompSci
