By Lee Li Feng
In the last decade, we’ve moved from Ops to DevOps to DevSecOps, and this trend of expansion is showing no signs of slowing down. In the past year, in fact, we’ve seen another acronym added: AIOps.
For cynical IT operations staff, it would be easy to write off this new development as hype; we’ve seen many other ideas that promised to revolutionize the way we work, only to fall at the first hurdle. AIOps, however, might be different. By drawing on recent major developments in AI, it uses automation to save time and help operations staff make the most of their limited resources.
In this article, we’ll look at what AIOps is, what it can do, and how it can (potentially) take teams to the next level of digital transformation in 2021.
First, let’s get the basics out of the way — what, exactly, is AIOps? Gartner came up with this term a couple of years ago, defining it as the automation of IT operations through the combination of machine learning with big data.
That might sound vague, and it is. However, it’s possible to define the broad outline of how AIOps would work in practice. AIOps platforms would:
- Draw data from multiple sources, both real-time and historical;
- Process this data and use it to find patterns in how networks are being used;
- Make actionable recommendations to operations staff on how they can improve the efficiency, efficacy, or security of their systems.
For instance, an AIOps platform would be able to look at patterns in network data and predict that demand for SD-WAN capacity at a few sites is about to increase because several applications are using more bandwidth than usual. Alternatively, such platforms could use ML capabilities to isolate anomalous traffic behavior, which may indicate an impending cyber attack. In other words, the promise of AIOps is that AI and human operators can work together to make the most of limited resources.
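As a rough illustration of the capacity prediction described above, the sketch below fits a straight-line trend to recent bandwidth samples and estimates how long a site has before it reaches its link capacity. The function names, the sample readings, and the choice of a simple linear trend are all assumptions made for illustration; a production AIOps platform would likely rely on far richer models.

```python
from typing import Optional, Sequence, Tuple


def fit_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_until_full(samples: Sequence[float], capacity: float) -> Optional[float]:
    """Sampling intervals after the last sample until the trend line reaches capacity.

    Returns None when usage is flat or falling, and 0.0 when the trend line
    is already at or above capacity.
    """
    slope, intercept = fit_trend(samples)
    if slope <= 0:
        return None
    crossing = (capacity - intercept) / slope  # x position where the line hits capacity
    return max(0.0, crossing - (len(samples) - 1))


# Hypothetical hourly bandwidth readings (Mbps) for one site with a 100 Mbps link.
usage_mbps = [40.0, 45.0, 50.0, 55.0, 60.0]
hours_left = intervals_until_full(usage_mbps, capacity=100.0)
print(hours_left)
```

In this made-up example usage grows by 5 Mbps per hour, so the estimate gives operations staff a concrete lead time rather than a raw stream of readings.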
Such platforms can build on services such as Alibaba Cloud’s Machine Learning Platform for AI (PAI), which offers over one hundred algorithm components, each tested on Alibaba Group’s internal services.
In practice, the implementation of AIOps is likely to be more fragmented than this comprehensive definition suggests. Rather than offering ITOps teams a completely automated platform that covers all the requirements of the role, AIOps will initially be applied to niche applications within well-defined tasks.
This is already happening. In fact, 78% of IT professionals now believe that AI is the technology that will have the greatest impact in the coming years. This perception has been bolstered by a number of recent, high-profile uses of AI.
The most prominent recent example is the way AI has been used to fight COVID-19, not just in analyzing research data but also in optimizing contact-tracing networks in real time. Even before the pandemic, researchers had been exploring how AI can be bolted onto ITOps workflows, and these investigations have revealed many promising paths for the future.
Let’s now discuss the current trends in AIOps, particularly those that are likely to have the most impact on ITOps staff in the next few years:
Until quite recently, ITOps staff have been hesitant to start using AI systems because of the difficulty of cleaning the data to be used with them. Operations is an infamously messy business that requires staff to integrate and learn from multiple sources and datasets separately, and until now AI systems have been quite brittle when faced with such varied data.
Several companies are now offering platforms that overcome this difficulty, or at least claim to. Alibaba Cloud’s ML platform, for instance, is an AI-enhanced network monitoring platform that is able to look at many different types of dataset simultaneously — metric, log, and transaction data — and discover correlations between them.
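To make the idea of discovering correlations across metric, log, and transaction data more concrete, here is a minimal sketch that computes Pearson correlations between per-minute series drawn from different sources. It does not use or represent any Alibaba Cloud API; the series names, the sample values, and the helper functions are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def strongly_correlated(
    sources: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Return (name_a, name_b, r) for every pair of sources with |r| >= threshold."""
    pairs = []
    for a, b in combinations(sorted(sources), 2):
        r = pearson(sources[a], sources[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical per-minute series from three different kinds of data.
sources = {
    "latency_ms": [20, 22, 21, 80, 85, 23],        # metric data
    "error_log_lines": [1, 0, 2, 30, 34, 1],       # log data
    "transactions": [100, 102, 98, 101, 99, 100],  # transaction data
}
for a, b, r in strongly_correlated(sources):
    print(f"{a} <-> {b}: r = {r:.3f}")
```

With these invented numbers, only the latency and error-log series move together, which is the kind of cross-source link an operator would otherwise have to spot by hand. Correlation alone does not establish cause, so a result like this is a prompt for investigation rather than a diagnosis.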
At a more fundamental level, adding genuine AI tools to the workflows of ITOps staff promises to deliver a much greater degree of visibility into, and control over, the systems and networks they oversee. This is due to the ability of AI models to take vast datasets and extract simple, actionable insights from them.
At the moment, ITOps departments are prone to being overwhelmed: the sheer amount of data they collect makes it impossible to assess how a particular system or network is performing. By taking this raw data and revealing the patterns present in it, AIOps platforms will allow ITOps staff to regain control of their systems.
Less than a year ago, many ITOps departments were still exploring the possibility of one day, in the not too distant future, moving a proportion of their staff to remote work. Then the pandemic happened, forcing employees to work from home and forcing employers to ramp up their digitalization efforts to support this “new normal”. This is why over 90% of companies are now working on digitally transforming their organizations, including making record-breaking investments in remote working for their employees.
The role of AI in managing remote teams is clear enough: with an exponential increase in endpoints, the security of these teams is now only manageable via machine learning platforms that can identify anomalous activity long before human analysts could.
ITOps staff are no strangers to progress and development: one only needs to look at the speed at which DevOps and then DevSecOps were implemented in many organizations to see this. AIOps promises to retain the increased focus on security that DevSecOps brought, while also providing a way for security teams and operations teams to collaborate.
This is due to the way that AIOps platforms deal with data. Take, for instance, a situation in which a particular endpoint is generating huge amounts of traffic. Instead of presuming that this is due to a misbehaving application (generally an Ops issue) or a malicious actor in a network (a cybersecurity problem), AIOps platforms will take a neutral stance, and compare the traffic pattern to previous instances. This, ultimately, reduces bias in operational tasks, and improves their efficiency.
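One simple way to compare a traffic pattern with previous instances is to score the current reading against the endpoint’s own history. The sketch below uses a z-score for this; the threshold of 3, the function names, and the sample figures are assumptions for illustration, not a description of how any particular AIOps product works.

```python
import math
import statistics
from typing import Sequence


def traffic_zscore(history: Sequence[float], current: float) -> float:
    """How many standard deviations `current` lies from the historical mean."""
    if len(history) < 2:
        raise ValueError("need at least two historical readings")
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # Perfectly flat history: any change at all is infinitely unusual.
        if current == mean:
            return 0.0
        return math.copysign(math.inf, current - mean)
    return (current - mean) / stdev


def classify(history: Sequence[float], current: float, threshold: float = 3.0) -> str:
    """Label a reading relative to the endpoint's own baseline, without guessing a cause."""
    z = traffic_zscore(history, current)
    if abs(z) < threshold:
        return "normal"
    return "unusually high" if z > 0 else "unusually low"


# Hypothetical hourly traffic (MB) previously seen from one endpoint.
history_mb = [120.0, 130.0, 125.0, 118.0, 127.0, 122.0, 131.0]
print(classify(history_mb, 126.0))
print(classify(history_mb, 900.0))
```

Note that the label only says how far the reading departs from the baseline. It deliberately says nothing about whether the cause is a misbehaving application or an attacker, which mirrors the neutral stance described above and leaves that judgment to the Ops and security teams looking at the same evidence.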
Of course, there remain a number of issues that AIOps platforms need to overcome before they become a preferred choice for businesses. One of these is that, at the moment, the platforms that can be used to provide this functionality are time-consuming to implement, which means that DevOps teams view their installation as a waste of valuable time.
This is likely to change in the coming years, as AIOps tools are integrated into software already used by Ops teams. In other words, for many teams the advent of AIOps is likely to be a gradual, invisible change rather than an explosive revolution. But ITOps teams should be aware of the advantages of using these tools early in their development, because they will ultimately make networks and systems far more efficient, and IT administrators far more effective.