What is MLOps and why do we even need it?
In this ever-evolving world of technology, industrialisation gave rise to physical machines that enabled us to automate tasks such as manufacturing. Technology kept progressing, and software became the next big industry, allowing us to optimise our businesses and daily lives. Software engineering itself was shaped by years of trial and error and a constant struggle to improve the development process. In the beginning, the majority of software projects failed because traditional software development used the waterfall technique for project management: upon reaching the deadline, the development team would discover that the project had vastly deviated from the original business requirements. As time passed, agile development gained momentum and replaced waterfall. Agile is a collection of techniques for discovering and developing solutions by involving all the stakeholders.
Birth of DevOps:
Towards the end of 2013, cloud computing started gaining pace. From 2015 to 2016, cloud computing (IaaS) saw a 38% increase in adoption, as it offered the ability to create servers in the cloud programmatically.
This was when the software industry realised that it no longer needed a dedicated team for maintaining physical servers, since servers could be created programmatically in the cloud. DevOps emerged from cross-domain roles and best practices spanning development and operations. Virtualisation gave birth to many other tools for managing and maintaining deployed programs and optimising the production life cycle, which would otherwise have been very difficult.
Another revolution in the world of technology:
In 2012, for the first time, researchers won the ImageNet image recognition competition using deep learning, and since then the field has grown exponentially. Traditional software engineering relies on human-defined logic and cannot make sense of the data sitting in databases. That was about to change with the increasing adoption of, and research in, machine learning, deep learning and data science. Andrew Ng, the former head of the Google Brain project, is a pioneer in the field of deep learning and AI. In 2012, his team programmed a neural network that could recognise cats among other objects.
Reference: https://www.theregister.com/2015/05/19/andrew_ng_ai_pioneer_praises_china/
New electricity, and what is that?
Andrew Ng worked on many other projects and gave many lectures on machine learning and deep learning. In 2017, he said something that became a buzzword in the modern world of technology:
AI is the new electricity.
According to him, AI would drive and transform most existing technologies and businesses, automating them along the way. Research in AI has produced many techniques for solving different problems: we can now translate speech to text, text to speech, and images to text, generate images from text, and build chatbots, autonomous vehicles, and more. All of these applications became possible only because of advances in machine learning and artificial intelligence.
Data Science – the sexiest job of the 21st century:
Data science was called the sexiest job of the 21st century by Harvard Business Review, for various reasons.
link: https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
It involved analytics, data munging, training machine learning models, and predicting possible outcomes from those models. Around this time everyone started becoming a data scientist, which meant learning several machine learning and deep learning algorithms; data science also requires the practitioner to have domain knowledge. Companies started hiring data scientists in large numbers in the hope of making the next breakthrough. We started hearing about data science everywhere. Data kept growing, and so did the demand for data scientists; traditional companies also started storing data to optimise their business processes.
Did the data science team fall short?
In 2017 it was projected that artificial intelligence would drive worldwide revenue of around $47 billion. In 2020, according to Statista, AI only generated $22.59 billion in revenue worldwide.
Reference: https://www.statista.com/statistics/607716/worldwide-artificial-intelligence-market-revenues/
Uber laid off its entire AI team.
Reference: https://analyticsindiamag.com/uber-ai-labs-layoffs/
Element AI was sold for less than its valuation.
What could be the reason?
AI is still projected to grow each year, but it has fallen short of expectations, partially due to Covid-19, though there are other reasons too.
Data scientists can create exceptionally good models, but they fall short when it comes to operationalising those models.
Deploying a model is not only about putting it into production; even after the model has been deployed, many things can go wrong.
Wait a minute, what about traditional software engineers? Can't they use their expertise to solve the problem of operationalising models?
The simple answer is "No". The tools traditional software engineers use for monitoring and health checks are useless when it comes to model analytics. With those tools alone, they also won't be able to tell whether the model has started to degrade or whether something fishy is going on with it.
Some of the requirements for a machine learning model in production:
- Model performance deteriorates over time; we should be able to know when it becomes unacceptable.
- The whole life cycle of the machine learning model, from data preparation to deployment, has to be tracked.
- A model should also be reproducible.
- We should be able to version the model.
- A model should have low prediction latency.
- The model in production should be scalable.
- It should be possible to compare models against each other.
- We should be alerted to abnormal behaviour of the model in deployment.
- A collaborative workspace is needed where the whole data team can sit together and share their work.
- We should be alerted about abnormal input data.
- Security issues should be addressed.
All these problems require both software engineering and machine learning skills.
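To make one of the requirements above concrete, here is a minimal sketch in plain Python of alerting on abnormal input data: incoming feature values are compared against the training-time baseline, and an alert is raised when they drift too far. The function names and the 3-sigma threshold are illustrative assumptions, not part of any particular MLOps tool.

```python
import statistics

def drift_score(baseline, incoming):
    """How far the incoming batch's mean has moved from the
    training-time baseline, in units of baseline standard deviations."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(statistics.mean(incoming) - mu) / sigma

def check_input_drift(baseline, incoming, threshold=3.0):
    """Return True (alert) when the incoming batch has drifted more
    than `threshold` standard deviations from the training data."""
    return drift_score(baseline, incoming) > threshold

# Feature values seen at training time vs. two live batches
baseline = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]
live_ok = [10.1, 9.9, 10.4]
live_drifted = [25.0, 26.5, 24.8]

print(check_input_drift(baseline, live_ok))       # False: within range
print(check_input_drift(baseline, live_drifted))  # True: raise an alert
```

A real deployment would compare full distributions (e.g. with statistical tests) per feature rather than a single mean, but the principle is the same: the monitoring signal comes from the data, not from CPU or memory metrics.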
Fill in the blank:
Physical machines -> Software
Software -> Software life-cycle optimisation ( DevOps ) to solve problems in the software life cycle
Machine learning and artificial intelligence -> _____________?
You are right, the answer is MLOps.
The research paper that has been referenced a lot:
The image below has been taken from the research paper "Hidden Technical Debt in Machine Learning Systems", reference: https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf
In my opinion:
“If AI is electricity then MLOps is the transmission line.”
So far, mostly big companies have made the most of AI. Companies like Google, Facebook, and Alibaba have enough resources and staff to manage AI at scale and in production; however, this does not hold for small or even mid-tier companies. Mid-tier companies have often struggled with productionising and operationalising models, since there is no fixed standard yet. For AI to reach its full potential, even small companies should be able to serve AI and ML models at scale.
MLOps exists to solve these problems and standardise the machine learning workflow. The term "MLOps" is a combination of ML and Ops: ML meaning machine learning and Ops meaning operations. The terms MLOps, AIOps, DSOps, and DataOps are often used interchangeably.
MLOps tools focus on model management, monitoring, automation, and improving the quality of machine learning models in production. Many vendors offer different solutions for model management and monitoring, but there is no fixed standard yet.
MLOps is still very new and will gradually become more robust. Many companies are working on different tools to solve the MLOps problem. Several tools are already out there, each with its pros and cons. Some of them are:
- MLFlow
- Amazon Sagemaker Studio
- Algorithmia
- TensorFlow Extended (TFX)
- HPE Ezmeral ML Ops
- Seldon
- Kubeflow
- Paperspace
- Pachyderm
- Azure Machine Learning
- DVC (Data Version Control)
- Google Cloud AI Platform
- MetaFlow
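Several of the tools above (MLFlow most visibly) revolve around experiment tracking and a model registry: every trained model is stored as a new version together with its parameters and metrics, so models are reproducible, versioned, and comparable. The toy, in-memory registry below sketches that idea in plain Python; the class and method names are invented for illustration and are not the API of any of the listed tools.

```python
class SimpleModelRegistry:
    """A toy, in-memory stand-in for the model registries that tools
    like MLFlow provide: versioned models with training metadata."""

    def __init__(self):
        self._versions = {}  # model name -> list of version records

    def register(self, name, params, metrics):
        """Store a new model version with its hyperparameters and
        evaluation metrics; return the assigned version number."""
        versions = self._versions.setdefault(name, [])
        record = {"version": len(versions) + 1,
                  "params": params,
                  "metrics": metrics}
        versions.append(record)
        return record["version"]

    def latest(self, name):
        """The most recently registered version of a model."""
        return self._versions[name][-1]

    def compare(self, name, metric):
        """All versions of a model sorted by a metric, best first:
        the 'model comparison' requirement from earlier."""
        return sorted(self._versions[name],
                      key=lambda r: r["metrics"][metric],
                      reverse=True)

registry = SimpleModelRegistry()
registry.register("churn-model", {"lr": 0.1}, {"auc": 0.81})
registry.register("churn-model", {"lr": 0.01}, {"auc": 0.86})

print(registry.latest("churn-model")["version"])            # 2
print(registry.compare("churn-model", "auc")[0]["params"])  # {'lr': 0.01}
```

The production-grade tools add what this sketch deliberately omits: persistent storage, artifact files, stage transitions (staging/production), access control, and UIs, which is exactly the operational layer MLOps standardises.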
Data scientists can build the models, but to operationalise them we need MLOps; hence, if AI is the electricity, then MLOps is the transmission line.
Thank you for reading!