“One day, all software will learn” — but not today.

(Source: Image by Shutterstock and Daniel Jeffries)

“One day all software will learn…” How many times have you heard that quote at some machine learning talk?

As an ML engineer and researcher over the last decade, I’ve heard it so many times it makes my eyes roll. When you’re applying machine learning to real-world problems, you have to separate the hype from reality. And it’s hard to imagine this utopian future when I struggle daily with seemingly trivial failure points in NLP and speech recognition models.

This may come as a shock to some, but machine learning development is not done when you achieve…

Data Privacy and the Challenge of Reproducibility

(Source: Image by Author and Shutterstock)

Last week, I went to work on a machine learning example that used the ever-popular ImageNet dataset. It’s the classic image classification problem known by every machine learning practitioner at this point: here’s an image, classify which of the 1,000 categories it fits in.

But I noticed something strange about it this time. First, the website went down, and when it came back up, everything had changed.

I didn’t think much of it until I realized that the download hashes were different and accuracies for the models were slightly off. …

ClearML + Pachyderm

(Source: Image by author)

“Move fast, think even faster,” is the ultimate goal of data science.

We all want our AI models to make predictions faster and better than we can make them ourselves. Even more, we want to develop those models at lightspeed. But the reality is often very different. Running experiments, comparing results, deploying models, and monitoring them is not fast or efficient. It’s slow, tedious, and time-consuming.

If we want to speed up the time it takes to get a model from idea to inference, we’ve got to get better at performing more experiments faster. …

Label Studio + Pachyderm

(Source: Image by author, Label Studio, and Pachyderm)

The key to building powerful machine learning models is learning “the right things from the right data.” Just as we humans constantly take in new information and update what we think about the world, ML models must continually learn from new data to keep their insights sharp and relevant. Continuous improvement is utterly crucial if you expect your model to work in the real world.

Breast cancer detection in Pachyderm (Image by author)

Breast cancer is a horrible disease that affects millions worldwide. In the US and other high-income countries, advances in medicine and increased awareness have significantly improved the survival rate of breast cancer to 80% or higher. However, in many lower-income countries the survival rate is below 40%, largely due to a lack of early detection systems.¹

Advances in AI and medicine can make massive differences in beating diseases like breast cancer by extending diagnostics, enhancing pattern recognition in imaging, and deploying these resources for those who need them most.

One promising advancement for early breast cancer detection systems is the…

(Source: Image by Shutterstock)

I’ve been building machine learning and data processing pipelines with Pachyderm for a while now. It’s an incredibly powerful platform, but as with most things, there are some “gotchas” along the way. Sadly, I learned all of these lessons the hard way, but hopefully I can help others avoid the pain with some of these tips and workarounds.

A comprehensive resource for deep learning in natural language processing and speech recognition.

Cover of Deep Learning for NLP and Speech Recognition

I am extremely excited to announce the availability of our textbook: Deep Learning for NLP and Speech Recognition!

Deep learning has quickly become a foundational technique in almost every AI and machine learning application. The ability to learn complex concepts directly from data and achieve much higher accuracies has rocketed it to one of the most popular research areas in technology. When my colleagues and I began putting this book together, we noticed that there wasn’t a single resource that covered the modern deep learning approaches in…

Jimmy Whitaker

Solving real-world problems with Machine Learning and AI | ML Developer Advocate @ pachyderm.io | Computer Science @UniOfOxford | Published @SpringerCompSci
