Artificial Intelligence: a comprehensive approach

Surrounded by technology.

Not many years ago, talking about artificial intelligence evoked an image of bipedal automatons performing distinctly human tasks with greater rigor and efficiency than ourselves (C-3PO in Star Wars) or, at the other extreme, robots with capabilities and intentions harmful to the existence of the human race (the Terminator in The Terminator, 1984). Thanks to the development of the internet and of technology in general, this situation has changed, and this area of computing is now especially important in diverse productive sectors.

Although the term first appeared in 1956, coined by John McCarthy, Marvin Minsky and Claude Shannon at the Dartmouth Conference, its evolution and applications grew very gradually until the end of the last century. The limiting factor in most cases was computational capacity, that is, the speed with which the computers of the time (true behemoths of epic dimensions) carried out their operations. To get an idea of the magnitude of the change: the ENIAC computer, designed by John William Mauchly and John Presper Eckert in a project begun in 1943, was capable of about 5,000 additions or roughly 300 multiplications per second; in 2018, the Chinese Sunway TaihuLight computer had a peak performance of about 125 petaflops, that is, some 125,000 trillion operations per second.

After this brief introduction, let’s dive deeper into the concepts, their definitions and their relationships.

So you are probably wondering, “then, what is Artificial Intelligence?”

Well, according to Merriam-Webster, we can base the definition on two different meanings:

– a branch of computer science that deals with the simulation of intelligent behavior in computers.

– the capability of a machine to imitate intelligent human behavior.

We are going to stay with the first definition and use it to locate all the terms that, in one way or another, are currently related to it, since the way they relate to each other can be confusing. Here is an image that summarizes all those connections:

Big picture of Artificial Intelligence, Machine Learning and Deep Learning and their relationships.

Machine Learning is surely one of the terms most repeated in the media over the last 10 years; its rise has been evident for a long time, but how does it relate to artificial intelligence? As we can see in the diagram, it is fully contained within the latter, so an appropriate definition would be the following:

Machine Learning is a branch of Artificial Intelligence focused on building applications that can recognize patterns in data (without being explicitly programmed) and can make accurate predictions once new data arrives.

Maybe you are now imagining something especially complex, but a Machine Learning algorithm can be something as simple as a linear regression or as complex as a gradient descent calculation to solve a non-linear equation.

For instance, imagine you want to predict your income given your years of higher education. As a first step, you have to define a function with two unknown parameters, here y (the intercept) and x (the slope), e.g.:

[income] = y + x * [years of education]

Then, give your algorithm a set of training data. This could be a simple table with data on some people’s years of higher education and their associated income. Next, let your algorithm draw the line, e.g. through an ordinary least squares (OLS) regression. Now, you can give the algorithm some test data, e.g. your personal years of higher education, and let it predict your income.

Simple and clear example of linear regression (red) given a set of points (blue).
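The steps just described (define the function, supply training data, fit the line by OLS, predict for new data) can be sketched in a few lines of Python. This is only a minimal illustration: the function names fit_ols and predict are hypothetical, the training table is made up (income in thousands per year), and the letters y and x follow the formula above, where y is the intercept and x the slope.

```python
from statistics import mean


def fit_ols(years, incomes):
    """Fit income = y + x * years by ordinary least squares; return (y, x)."""
    mean_years = mean(years)
    mean_income = mean(incomes)
    # Slope: co-variation of both columns divided by the variation of the input.
    covariation = sum((a - mean_years) * (b - mean_income)
                      for a, b in zip(years, incomes))
    variation = sum((a - mean_years) ** 2 for a in years)
    x = covariation / variation
    # Intercept: the fitted line always passes through the point of means.
    y = mean_income - x * mean_years
    return y, x


def predict(y, x, years_of_education):
    """Apply the learned line to a new value."""
    return y + x * years_of_education


# Made-up training data (an assumption for illustration only):
# years of higher education and yearly income in thousands.
train_years = [0, 1, 2, 3, 4, 5]
train_income = [22, 25, 27, 32, 33, 38]

y, x = fit_ols(train_years, train_income)

# "Test data": someone with 6 years of higher education.
print(f"{predict(y, x, 6):.1f}")  # prints 40.4
```

Nothing in the code states how income depends on education; the slope and intercept come entirely from the table, which is the sense in which the line is learned.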

While this example sounds simple, it does count as machine learning, and yes, the driving force behind machine learning is ordinary statistics. The algorithm learned to make a prediction without being explicitly programmed, based only on patterns and inference.
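The iterative, gradient-based approach mentioned earlier can be tried on the same income problem. The sketch below is an assumption-laden illustration, not a reference implementation: the function name gradient_descent, the learning rate, the number of steps and the training table are all made up. Instead of solving for the line in one shot, it starts from an arbitrary guess and repeatedly nudges y and x in the direction that reduces the mean squared error.

```python
def gradient_descent(years, incomes, learning_rate=0.05, steps=5000):
    """Fit income = y + x * years by iteratively minimizing mean squared error."""
    y, x = 0.0, 0.0  # arbitrary starting guess
    n = len(years)
    for _ in range(steps):
        # Prediction error of the current line on every training point.
        errors = [(y + x * a) - b for a, b in zip(years, incomes)]
        # Partial derivatives of the mean squared error with respect to y and x.
        grad_y = 2 / n * sum(errors)
        grad_x = 2 / n * sum(e * a for e, a in zip(errors, years))
        # Move a small step "downhill", against the gradient.
        y -= learning_rate * grad_y
        x -= learning_rate * grad_x
    return y, x


# Made-up training data (an assumption for illustration only):
# years of higher education and yearly income in thousands.
train_years = [0, 1, 2, 3, 4, 5]
train_income = [22, 25, 27, 32, 33, 38]

y, x = gradient_descent(train_years, train_income)
print(f"intercept={y:.3f} slope={x:.3f}")
```

For a straight line this is overkill, since the least squares solution has a closed form, but the same loop carries over to models where no closed form exists, which is why it sits at the "complex" end of the spectrum.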

Machine learning now plays a major role in our lives: it powers the recommendation algorithms of the most important platforms (Netflix, YouTube, Google Search), the optimization of logistics tasks in company warehouses, pest detection in livestock, bank fraud detection and much more.

Within Machine Learning there is an even narrower area, which can be defined as follows:

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks.

That is, we do not distinguish this field within Machine Learning by the problems it tackles or by the type of data, only by the algorithms used to arrive at a solution: Artificial Neural Networks (ANN). Here is an example of what they look like:

Standard structure for an Artificial Neural Network.

Consider the example ANN in the image above. The leftmost layer is called the input layer and the rightmost layer the output layer. The middle layers are called hidden layers because their values aren’t observable in the training set. In simple terms, hidden layers hold calculated values used by the network to do its “magic”. The more hidden layers a network has between the input and output layers, the deeper it is. In general, any ANN with two or more hidden layers is referred to as a deep neural network.
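To make the layer vocabulary concrete, here is a minimal sketch of how values flow through such a network from the input layer to the output layer. Everything in it is an assumption chosen for illustration: the layer sizes (3 inputs, two hidden layers of 4 neurons, 1 output), the sigmoid activation, the helper names, and above all the weights, which are random and untrained, so the output carries no meaning.

```python
import math
import random


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of all inputs per neuron, then sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(neuron_weights, inputs)) + b)
            for neuron_weights, b in zip(weights, biases)]


def random_layer(n_inputs, n_neurons, rng):
    """Create untrained (random) weights and biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [rng.uniform(-1, 1) for _ in range(n_neurons)]
    return weights, biases


def forward(inputs, layers):
    """Pass the inputs through every layer in turn and return the final values."""
    activations = inputs
    for weights, biases in layers:
        # After each hidden layer these are the "hidden" calculated values.
        activations = dense(activations, weights, biases)
    return activations


rng = random.Random(0)
layer_sizes = [3, 4, 4, 1]  # input layer, two hidden layers, output layer
layers = [random_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

output = forward([0.5, -1.2, 3.0], layers)
print(output)  # a single value between 0 and 1
```

With two hidden layers this toy network already meets the definition of a deep neural network given above; training it would mean adjusting the weights and biases so that the outputs match known examples.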

Steady improvements in computing capacity have made the emergence of this field possible over the last 5 years, helped by free, ready-to-use libraries such as TensorFlow, Torch or Theano.

With these three terms, we have outlined at a high level the encouraging current landscape of the data age. It is impossible to list here every classification and subclassification within each of them, since very different groupings can be produced depending on the characteristics considered.

I hope, at least, to have shed a little more light on a complex world for those who were not so familiar with it and/or to have clarified some doubts for those who already knew it.

I leave you with some references that cover, from a formal and taxonomic point of view, the different problems addressed by these branches of computing:

[1] Theobald, Oliver (2017). Machine Learning For Absolute Beginners: A Plain English Introduction
[2] Flach, Peter (2012). Machine Learning

And some that may help you identify concrete, everyday problems where Artificial Intelligence and its subfields are especially useful:

[1] Fry, Hannah (2018). Hello World
[2] O'Neil, Cathy (2016). Weapons of Math Destruction
