Artificial Intelligence (AI) is here to stay. For over a decade it has been changing industries at an accelerating pace. Many people, even in technology, are still skeptical about it, and some simply do not like it, but there is no way back. It is now time to master AI and operate AI/ML with the same level of maturity the industry has for software development, automating it and integrating it with the rest of the IT ecosystem. This is the MLOps turn.
MLOps (Machine Learning Operations) gained attention with Google’s famous 2015 paper, which painted a picture that is familiar to professionals working on machine learning (ML) but tends to scare those who are entering the area.
As you can see above, creating and training ML models is a very small part of the work compared with the surrounding activities required to make these models ready for production, to scale them sustainably, and to react to changes in the business context and in macro scenarios that affect people's behavior (COVID-19 is one example).
The concept of MLOps is an adaptation of DevOps that includes the nuances and steps of machine learning. It uses similar Continuous Integration (CI) and Continuous Delivery (CD) pipelines and introduces a brand new concept, Continual Training (CT).
ML software is more complex than traditional software, so its CI/CD pipelines differ from traditional DevOps pipelines by having additional elements to consider.
In MLOps, CI is more than testing and validating code and components, how they work together, and how fast they run. It also needs to validate the data profile, the model accuracy, and the model's performance in relation to the business problem it is trying to solve.
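To make the idea of an ML-aware CI check more concrete, here is a minimal sketch of a gate that a pipeline could run next to the usual unit tests. Everything in it is an assumption made for illustration: the `ColumnProfile` schema, the function names, and the 0.8 accuracy threshold are hypothetical and do not come from any specific MLOps product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ColumnProfile:
    """Expected range for the mean of one numeric feature (hypothetical schema)."""
    name: str
    min_mean: float
    max_mean: float


def validate_data_profile(rows, profiles):
    """Return a list of violations; an empty list means the data passed."""
    problems = []
    for profile in profiles:
        values = [r[profile.name] for r in rows if r.get(profile.name) is not None]
        if len(values) < len(rows):
            problems.append(f"{profile.name}: missing values")
        if values and not (profile.min_mean <= mean(values) <= profile.max_mean):
            problems.append(f"{profile.name}: mean outside expected range")
    return problems


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def ci_gate(rows, profiles, y_true, y_pred, min_accuracy=0.8):
    """Combine the data profile check and the model accuracy check."""
    problems = validate_data_profile(rows, profiles)
    acc = accuracy(y_true, y_pred)
    if acc < min_accuracy:
        problems.append(f"accuracy {acc:.2f} below threshold {min_accuracy:.2f}")
    return problems


if __name__ == "__main__":
    rows = [{"age": 30}, {"age": 40}]
    profiles = [ColumnProfile("age", 20, 60)]
    # A CI job would fail the build when this list is not empty.
    print(ci_gate(rows, profiles, [1, 0, 1, 1], [1, 0, 1, 0]))
```

A real pipeline would replace the accuracy threshold with whatever metric best reflects the business problem, which is the point of the paragraph above.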
In the same way, CD is not only about packaging and deploying a service or microservice. It is about understanding which trained model makes the most sense to deploy, based on concept or data drift, and executing the pipeline that automatically deploys the best model’s prediction service.
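As a rough illustration of a drift-aware deployment decision, the sketch below measures how far the mean of a live feature has moved from the training data and only reconsiders the deployed model when that shift is large. The drift measure (mean shift in reference standard deviations), the 0.5 threshold, and the `recent_score` field are all simplifying assumptions chosen for the example. Production systems usually rely on richer statistical tests.

```python
from statistics import mean, pstdev


def drift_score(reference, live):
    """Shift of the live mean, measured in reference standard deviations."""
    spread = pstdev(reference)
    if spread == 0:
        return 0.0 if mean(live) == mean(reference) else float("inf")
    return abs(mean(live) - mean(reference)) / spread


def select_model_to_deploy(current, candidates, drift, drift_threshold=0.5):
    """Return the name of the model that should serve predictions.

    `current` and each candidate are dicts with a 'name' and a
    'recent_score' (a metric computed on recent labelled data).
    """
    if drift < drift_threshold:
        # Data still looks like the training data: keep what is deployed.
        return current["name"]
    best = max(candidates + [current], key=lambda m: m["recent_score"])
    return best["name"]


if __name__ == "__main__":
    drift = drift_score([10, 12, 14, 16], [15, 17, 19, 21])
    current = {"name": "v1", "recent_score": 0.70}
    candidates = [{"name": "v2", "recent_score": 0.82}]
    print(select_model_to_deploy(current, candidates, drift))
```

The CD pipeline would then package and roll out whichever model name this step returns.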
CT is new. It is about the ability to continually train models, fine-tune hyperparameters, orchestrate experimentation, and store metadata that can give data scientists insight into possibly improved models, or even automatically trigger new CD pipelines to deploy the most recent models.
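The loop described above can be sketched in a few lines. This is a toy example under stated assumptions: the "model" is a single threshold on one feature, the metadata store is a plain Python list, and `deploy` is a hypothetical callback standing in for whatever triggers the CD pipeline.

```python
import time


def train_and_evaluate(threshold, data):
    """Toy 'model': predict 1 when x >= threshold; return accuracy on data."""
    correct = sum((x >= threshold) == bool(y) for x, y in data)
    return correct / len(data)


def continual_training_run(data, thresholds, production_score, deploy, metadata_store):
    """Try each hyperparameter, record metadata, and deploy on improvement."""
    runs = []
    for threshold in thresholds:
        score = train_and_evaluate(threshold, data)
        runs.append({"threshold": threshold, "score": score, "timestamp": time.time()})
    # Keep every experiment so data scientists can inspect it later.
    metadata_store.extend(runs)
    best = max(runs, key=lambda run: run["score"])
    if best["score"] > production_score:
        # Stand-in for triggering a CD pipeline with the better model.
        deploy(best)
    return best


if __name__ == "__main__":
    data = [(1, 0), (2, 0), (3, 1), (4, 1)]
    store, deployed = [], []
    best = continual_training_run(data, [1.5, 2.5, 3.5], 0.9, deployed.append, store)
    print(best["threshold"], len(store), len(deployed))
```

In practice this function would run on a schedule or when new data arrives, and the metadata would go to an experiment tracking system instead of an in-memory list.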
Below you can see these steps in orange, as well as data engineering and DevOps steps for a full machine learning lifecycle.
In 2021, most cloud platforms are still on the initial versions of their MLOps offerings. Based on discussions with product teams of the big players and on what they showcased at 2020 events, this is a very hot area, with large investments in both managed services and open source products to support MLOps. This means it will evolve and mature radically in the next couple of years, allowing enterprises to incorporate AI into all aspects of the business, augmenting and extending human capacity to unimaginable levels at scale.