Completing the Machine Learning Loop

“One day, all software will learn” — but not today.

(Source: Image by Shutterstock and Daniel Jeffries)

In other words, it’s all about the iteration.

Capability vs. Ability

The Data Science Process

Functional diagram of a machine learning system. Model development is typically an offline process which results in a trained model or inference pipeline to be incorporated into a production analytics system. Over time, data from the production system (typically a data lake) is pulled into the model development process to improve an analytic’s quality and/or performance. (Image by author)

Software Development: Two Life Cycles Diverge in the Woods

The Software Development Life Cycle (SDLC) is a useful construct to show the journey software must continually undergo. (Image by author)

Testing and monitoring required in traditional software systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)

The Two Loops

The Two Loops: A model of what machine learning software development encompasses. The Code Loop is crucial for developing stable, efficient ML software, while the Data Loop is essential for improving model quality and keeping the model relevant. Creating ML models requires the Code Loop and Data Loop to interact at various stages, such as model training and monitoring. (Image by author)

Data: The new source code

Data Bugs

Biased data is buggy data: if it is unreliable or does not represent the domain, it is a bug.
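If biased data is a bug, one way to catch it is to write a test for the data, much as one would for code. The following is a minimal sketch, assuming a labeled dataset and an expected class distribution for the domain that is known in advance. The function name `check_label_distribution`, the tolerance value, and the example labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def check_label_distribution(labels, expected, tolerance=0.10):
    """Return a list of problems where observed class shares drift from expected ones.

    labels:    iterable of class labels found in the training data
    expected:  dict mapping each class to its expected share in the domain (assumed known)
    tolerance: largest acceptable absolute gap between observed and expected share
    """
    labels = list(labels)
    if not labels:
        return ["dataset is empty"]

    counts = Counter(labels)
    total = len(labels)
    problems = []

    # Flag every expected class whose observed share is too far from its expected share.
    for cls, expected_share in expected.items():
        observed_share = counts.get(cls, 0) / total
        if abs(observed_share - expected_share) > tolerance:
            problems.append(
                f"class {cls!r}: observed {observed_share:.2f}, expected {expected_share:.2f}"
            )

    # Labels that the domain description does not mention are also treated as bugs.
    for cls in counts:
        if cls not in expected:
            problems.append(f"unexpected class {cls!r}")

    return problems


# Hypothetical example: the domain is roughly balanced, but one collected dataset is not.
balanced = ["cat"] * 50 + ["dog"] * 50
skewed = ["cat"] * 90 + ["dog"] * 10
expected_shares = {"cat": 0.5, "dog": 0.5}

print(check_label_distribution(balanced, expected_shares))  # no problems reported
print(check_label_distribution(skewed, expected_shares))    # both classes flagged
```

A check like this could run whenever new data is committed, so that a skewed dataset fails fast instead of silently producing a skewed model. Label balance is only one possible notion of "representing the domain"; real checks would depend on the data.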

Testing and monitoring required in machine learning systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)

“Keeping the algorithm fixed and varying the training data was really, in my view, the most efficient way to build the system.” — Andrew Ng
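The idea in this quote can be illustrated with a toy experiment: hold the training code constant and change only the dataset version. The sketch below is an assumption-laden illustration, not the method described by Andrew Ng. It uses a tiny nearest-centroid classifier on made-up one-dimensional data, where `data_v1` contains two mislabeled points and `data_v2` is the same data with those labels corrected. All names and values are hypothetical.

```python
def train_centroids(samples):
    """The fixed 'algorithm': compute the mean feature value for each class."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, label in samples if predict(centroids, x) == label)
    return correct / len(samples)


# A fixed validation set, used to compare both dataset versions fairly.
validation = [
    (1.0, "low"), (2.0, "low"), (4.0, "low"),
    (6.0, "high"), (8.0, "high"), (9.0, "high"),
]

# Version 1 of the training data: the points at 9.0 and 10.0 are mislabeled as "low".
data_v1 = [
    (1.0, "low"), (2.0, "low"), (9.0, "low"),
    (10.0, "low"), (8.0, "high"), (9.5, "high"),
]

# Version 2: same points and same code, with only the two labels corrected.
data_v2 = [
    (1.0, "low"), (2.0, "low"), (9.0, "high"),
    (10.0, "high"), (8.0, "high"), (9.5, "high"),
]

acc_v1 = accuracy(train_centroids(data_v1), validation)
acc_v2 = accuracy(train_centroids(data_v2), validation)

# The algorithm did not change between runs; only the data did.
print(f"v1 accuracy: {acc_v1:.2f}")
print(f"v2 accuracy: {acc_v2:.2f}")
```

Because the training function is identical in both runs, any difference in validation accuracy comes from the data alone, which is why versioning datasets matters as much as versioning code.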

MLOps: DevOps for Machine Learning

MLOps Principles and Key Considerations (updated/modified by author from Summary of MLOps Principles and Best Practices)

Practical MLOps: Pachyderm for data and code+data

Pachyderm architectural and user overview (Image by Pachyderm)

From blog post Pachyderm and the power of GitHub Actions: MLOps meets DevOps (Image by author)

There and Back Again

Jimmy Whitaker

Applying AI the right way | Chief Scientist — AI @pachyderm.io | Computer Science @UniOfOxford | Published @SpringerCompSci