Completing the Machine Learning Loop

“One day, all software will learn” — but not today.

(Source: Image by Shutterstock and Daniel Jeffries)

Capability vs. Ability

In the field of AI, you’ve got to separate capability and ability. For instance, I may have the capability of becoming a decent piano player, but my ability to play “La Campanella” is limited by the amount of time I spend practicing.

The Data Science Process

Functional diagram of a machine learning system. Model development is typically an offline process which results in a trained model or inference pipeline to be incorporated into a production analytics system. Over time, data from the production system (typically a data lake) is pulled into the model development process to improve an analytic’s quality and/or performance. (Image by author)

Software Development: Two Life Cycles Diverge in the Woods

At first glance the Machine Learning Loop that we’re describing sounds very similar to the software development life cycle (SDLC) — a logical process that aims to produce high-quality software through a series of development stages. Completing the loop is done by incorporating feedback from a new feature into the planning of future iterations.

The Software Development Life Cycle (SDLC) is a useful construct to show the journey software must continually undergo. (Image by author)
  1. CI/CD — automated testing on code changes, remove manual overhead
  2. Agile software development — short release cycles, incorporate feedback, emphasize team collaboration
  3. Continuous monitoring — ensure visibility into the performance and health of the application, alerts for undesired conditions
  4. Infrastructure as code — automate dependable deployments
Testing and monitoring required in traditional software systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)
  1. Code is then integration tested. That’s where different software pieces are tested together to ensure the full system functions as expected.
  2. The system is then deployed and monitored, collecting information on user experience, resource utilization, errors, alerts and notifications that inform future development.

The Two Loops

Machine learning development is not the same as software development. The most important difference is that we are now dealing with two moving pieces: Code and Data.

The Two Loops: A model of what machine learning software development encompasses. The Code Loop is crucial to develop the ML software for model stability and efficiency, while the Data Loop is essential to improving model quality and maintaining the model’s relevance. Creating ML models requires the Code Loop and Data Loop to interact at various stages, such as model training and monitoring. (Image by author)

Data: The new source code

ML code, just as in traditional software development, needs to be versioned, reliable, and tested. However, although it requires similar attention, data is quite different from code:

  • Data grows and changes frequently
  • Data has many types and forms (video, audio, text, tabular, etc.)
  • Data has privacy challenges, beyond managing secrets (GDPR, CCPA, HIPAA, etc.)
  • Data can exist in various stages (raw, processed, filtered, feature-form, model artifacts, metadata, etc.)
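One way to picture what "versioning" data could mean is to derive a fingerprint from a dataset's contents, much like a commit hash for code. The sketch below is only an illustration under that assumption: the function name `fingerprint_dataset` and the directory-of-files layout are hypothetical, and it is not how any particular tool implements data versioning.

```python
import hashlib
from pathlib import Path


def fingerprint_dataset(root: Path) -> str:
    """Return a hex digest that changes whenever any file under root changes."""
    digest = hashlib.sha256()
    # Sort paths so the result does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Include the relative path so that renames also change the fingerprint.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Reads each file fully into memory; fine for a sketch, not for huge files.
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Because the files are treated as opaque bytes, the same function works for video, audio, text, or tabular data. Storing the fingerprint next to a trained model records which version of the data produced it.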

Data Bugs

With ML code, logic is no longer manually coded, but learned. The subtle thing about this new training stage is that it actually makes the quality of our data just as important as the quality of our code. That also means that bugs in our model can be the result of either bugs in code or in data.
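To make the idea of a "data bug" concrete, here is a minimal sketch of a check that could run on a batch of records before training, in the same spirit as a unit test on code. The function `find_data_bugs`, the list-of-dicts record format, and the column names in the usage note are all hypothetical and chosen only for illustration.

```python
from typing import Any


def find_data_bugs(rows: list[dict[str, Any]], required: dict[str, type]) -> list[str]:
    """Return human-readable problems found in rows; an empty list means none."""
    problems = []
    for i, row in enumerate(rows):
        for column, expected_type in required.items():
            value = row.get(column)
            if value is None:
                # The column is absent or explicitly null in this record.
                problems.append(f"row {i}: missing value for '{column}'")
            elif not isinstance(value, expected_type):
                # The value is present but has the wrong type, e.g. "31" vs 31.
                problems.append(
                    f"row {i}: '{column}' is {type(value).__name__}, "
                    f"expected {expected_type.__name__}"
                )
    return problems
```

For example, calling `find_data_bugs(rows, {"age": int, "label": str})` on records where one `age` arrives as the string `"31"` and another record lacks a `label` reports both problems. The training code can be entirely correct and the resulting model would still be wrong.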

Testing and monitoring required in machine learning systems (Image adapted from: The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction)

“Keeping the algorithm fixed and varying the training data was really, in my view, the most efficient way to build the system.” — Andrew Ng

This is the “what” we need to bring to ML development, but how do we do it? What tools will enable us to do it right?

MLOps: DevOps for Machine Learning

Similar to DevOps, MLOps is the developer-centric approach to managing machine learning software: DevOps for a Software 2.0 world. It focuses on the proper way to manage the full life cycle of machine learning development and the tooling needed.

MLOps Principles and Key Considerations (updated/modified by author from Summary of MLOps Principles and Best Practices)

Practical MLOps: Pachyderm for data and code+data

There are too many variables to safely make a call on which software components will come together to solve all of today’s big challenges in data science. That’s why, for now, we’re going to take a look at one foundational piece of the Learning Loop, rather than try to predict all the pieces of what a final stack will look like.

Pachyderm architectural and user overview (Image by Pachyderm)
  • In the case of types and formats, data is treated as opaque and general. Any type of file (audio, video, CSV, etc.) can be placed into storage and read by pipelines, which keeps the system generalizable rather than focused on one kind of data, the way a database focuses only on certain kinds of well-structured data.
  • As for the size and quantity of data, Pachyderm uses a distributed object storage backend which can scale to essentially any size. In practice, it is based on object storage services, such as Amazon S3 or Google Cloud Storage.
From blog post Pachyderm and the power of GitHub Actions: MLOps meets DevOps (Image by author)

There and Back Again

As we’ve seen throughout this article, data is a foundational element of the machine learning loop. It exists not as a single step before model development, but as a loop that we iterate through during the life cycle of our ML application, allowing our software to learn.

Solving real-world problems with Machine Learning and AI | ML Developer Advocate @ | Computer Science @UniOfOxford | Published @SpringerCompSci
