

If you are a science fiction reader (like my husband), you might have noticed a pattern running through the literature: Artificial Intelligence as the antagonist of the story. Of course, there are a few fictional works with altruistic AIs, but they are far rarer. Either way, tales of science fiction and their fictional AIs offer a window to reflect upon ourselves as humankind.
In reality, though, what we commonly call “AI” is not an attempt to replicate any of these creations of the imagination. More often than not, the AIs of science fiction are self-aware and conscious; they know they are of this world, and they make decisions and take actions driven by that knowledge of their own existence. The various AI components we deal with every day in the enterprise world, however, are nowhere near self-awareness, nor do they even attempt that problem.
I have an anecdote to explain what I mean by this. A few years back, I happened to take an interest in neural networks. One of the fun projects I worked on was training a computer to recognize (or approximate) the hex code of a color from an image of that color. For example, an image of cyan should resolve to #00ffff. Here, I would like to emphasize that none of what I am about to describe was invented by me. I was merely implementing an algorithm, just as one would implement an algorithm for sorting or search. However, it is a good enough example (albeit primitive by today’s standards) to get my point across.
To teach a machine to identify a hex string from an image of color, you first need a bunch of images that represent colors. My implementation resolved an input image to one of four values: cyan, purple, magenta, and peach. The algorithm called for a three-layer structure: a bottom layer with the inputs, a ‘top’ layer with the outputs, and a middle layer of ‘logic’ nodes. Each input fed into every logic node and was assigned a “weight”. Each logic node, in turn, had a weighted connection to each output node. The weights act as simple multipliers, so that logic node 1 receives Input 1 × w(1,1) + Input 2 × w(2,1), and so on; all the other logic nodes derive their values from similar formulae.
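As a sketch of that structure (the layer sizes and the weight values here are made up for illustration; this is not my original code), the weighted sums might look like this in Python:

```python
# Toy three-layer structure: input nodes -> 'logic' nodes -> output nodes.
# Every node feeds every node in the next layer through a weight:
#   logic node j = Input 1 * w(1, j) + Input 2 * w(2, j) + ...

def layer_values(inputs, weights):
    """Compute each node's value as a weighted sum of the previous layer.

    weights[i][j] is the weight on the connection from node i of the
    previous layer to node j of this layer.
    """
    n_nodes = len(weights[0])
    return [sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
            for j in range(n_nodes)]

# Three inputs (say, the averaged R, G, B of the image), four 'logic'
# nodes, and four outputs (cyan, purple, magenta, peach).
inputs = [0.0, 1.0, 1.0]  # a cyan-ish pixel: no red, full green, full blue

w_input_to_logic = [
    [0.1, 0.2, 0.3, 0.4],  # weights from input 1 to each logic node
    [0.5, 0.1, 0.2, 0.3],  # weights from input 2 to each logic node
    [0.2, 0.4, 0.1, 0.5],  # weights from input 3 to each logic node
]
w_logic_to_output = [
    [0.3, 0.1, 0.2, 0.1],  # weights from logic node 1 to each output
    [0.2, 0.3, 0.1, 0.2],
    [0.1, 0.2, 0.3, 0.1],
    [0.4, 0.1, 0.2, 0.3],
]

logic = layer_values(inputs, w_input_to_logic)
outputs = layer_values(logic, w_logic_to_output)
```

Real networks also pass each sum through a non-linear activation function, which I am skipping here to keep the arithmetic visible.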
The idea is that you could feed an image to the input nodes and let the math ripple through the whole system. This would result in a number for each output node representing the ‘likelihood’ of that output. So, if you fed in cyan as your input, you would want the value at the cyan output node to be the highest.
You would provide the system with a known set of inputs and outputs, plus a ‘correction algorithm’. The system starts with a predefined combination of weights and is fed a series of known inputs. Depending on how the actual output compares to the desired outcome, the weights are adjusted using a formula; for simplicity, something like: multiply by 0.8 if the wrong output is produced, but divide by 0.8 if the right output is produced. Assuming the feedback algorithm is robust enough, and given sufficient time, the system will arrive at a combination of weights that produces the desired output every time!
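A crude sketch of such a correction loop, using the toy 0.8 rule from above with two simplifications of my own: weights are only adjusted when the guess is wrong, and only the weights attached to active inputs are touched. To keep it short, it trains on just two of the four colors, with the middle layer dropped so each output is scored directly from the inputs:

```python
def predict(weights, sample):
    """Score each output as a weighted sum of the inputs; highest wins."""
    scores = [sum(x * w for x, w in zip(sample, row)) for row in weights]
    return max(range(len(scores)), key=lambda i: scores[i])

def train(weights, samples, labels, epochs=20):
    """Adjust weights with the toy 0.8 rule whenever a guess is wrong."""
    for _ in range(epochs):
        for sample, label in zip(samples, labels):
            guess = predict(weights, sample)
            if guess != label:
                for i, x in enumerate(sample):
                    if x > 0:  # only the weights of active inputs
                        weights[guess][i] *= 0.8  # weaken the wrong output
                        weights[label][i] /= 0.8  # strengthen the right one
    return weights

# Known inputs (averaged R, G, B) and known outputs: 0 = cyan, 1 = magenta.
samples = [(0.0, 1.0, 1.0), (1.0, 0.0, 1.0)]
labels = [0, 1]

# Start from a predefined (here: uniform) combination of weights,
# one row per output node.
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
train(weights, samples, labels)
```

After a few passes, the weight rows drift apart until each known input lights up its own output; this toy rule carries no convergence guarantee, which is why the robustness of the feedback algorithm matters.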
However, having done all of this, if I introduce a new desired output, say beige, and feed in its corresponding input, the system will most likely produce the wrong output. I can, however, train the system again by feeding it this new known input and output. It would arrive at a new combination of weights that produces the required output for all the old values AND the new value.
Modern neural networks are much, much more complicated than this description, but similar intuitions still hold. One can imagine that a small system iterating over a small dataset in the example would very likely get the outcome for a previously unseen input very wrong. However, do this over a large enough dataset, with a smart enough feedback algorithm, and you could have a system that can make a good ‘guess’ for an input it has never seen before. This is called ‘supervised machine learning’, where we train a system using a labeled set of data that we already know the answers to. This is the typical “AI” at work in the enterprise space today.
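To make the “good guess for an unseen input” idea concrete, here is the same labeled-data setup with an even simpler supervised method, a one-nearest-neighbor classifier (not the neural network from the story; the peach and purple hex codes are approximate):

```python
# Supervised learning in miniature: labeled examples in, a guess for an
# unseen input out. Here the 'model' is just 1-nearest-neighbor.

def nearest_label(examples, query):
    """Return the label of the training example closest to `query`."""
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(examples, key=lambda ex: dist_sq(ex[0], query))[1]

# The labeled training set: (R, G, B) -> hex code.
training_set = [
    ((0.0, 1.0, 1.0), "#00ffff"),  # cyan
    ((0.5, 0.0, 0.5), "#800080"),  # purple (approximate)
    ((1.0, 0.0, 1.0), "#ff00ff"),  # magenta
    ((1.0, 0.9, 0.7), "#ffe5b4"),  # peach (approximate)
]

# An input the system has never seen: a teal-ish color. The learner
# still produces a reasonable guess: the nearest known color.
guess = nearest_label(training_set, (0.1, 0.8, 0.9))
```

The mechanics differ from a neural network, but the contract is the same: train on data you already know the answers to, then answer for inputs you have never seen.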
The point of this post is that solving problems like the one described above involves no attempt at cognition or self-awareness. Instead, these are robust mathematical approaches for building complex functions that produce the desired output for a given pool of inputs. Given enough horsepower and a large, varied, and accurate training set, such models can “solve” some fascinating problems. While I recognize that there are people out there working on truly cognitive AI, the AI we use in the enterprise software industry to solve real-world problems today is far from the mystical AI characters we have created in our minds.