Since Artificial Intelligence (AI) algorithms became popular, especially through their use in some of the most popular entertainment and social media products and through their ability to drive cars rather successfully, they have come under public scrutiny like never before. People from all sorts of fields, no doubt under the influence of celebrity figures who talk about AI, such as Elon Musk or Stephen Hawking, are wondering about the ethical implications of allowing algorithms into certain aspects of our lives and whether their careers run the risk of being replaced by the new technology. We would like to show in this essay that these issues are perhaps not framed nearly as coherently as their proponents believe, by putting side by side a couple of similarities, but also key differences, between how machines learn and how humans do.
A note should be made that, in computer science, “artificial intelligence” is an umbrella term which encompasses machine learning (ML) algorithms, but also expert systems, deterministic systems or any other piece of code that is able to make decisions by itself. However, to simplify, we shall use the terms “AI” and “machine learning” interchangeably, since ML algorithms are what people have in mind when discussing the field and are the beneficiaries of almost all of the academic and research attention in it. Technically, machine learning is a sub-field, but it is the most relevant one when it comes to the problem of learning.
Two types of ethical concerns tend to present themselves in the public discourse: specific and general. The former are generally raised by experts in the field and concern very narrow problems in particular systems: for example, algorithms that produce biased results when calculating credit scores for men and women because there was bias in the data. General issues tend to be oriented towards the future and usually presume the ability to implement a General Artificial Intelligence (GAI), that is, an algorithm that could solve tasks from various domains of life. However, nothing even close to that exists today.
Historically, in “traditional” software engineering, the programmers would write a set of instructions, the algorithm, that, after being presented with an input, would produce an output. The paradigm is slightly shifted in machine learning. An algorithm is presented with both inputs and outputs, and its task is to analyze the relationships between the two and construct a model, which can later be used to compute the outputs for other inputs based on its discoveries. The algorithm has to come up with the set of instructions during this process, which we call “training”. Since meaningful input data can only come from one corpus, the resulting models are task-dependent; they cannot be applied to anything else. ML algorithms create idiot savants that, as a famous analogy in the field says, would continue to play chess even if the building were on fire.
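The contrast between the two paradigms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperature-conversion task, the function names and the example data are all hypothetical and chosen for simplicity, and ordinary least squares stands in for whatever training procedure a real system would use.

```python
def fahrenheit_rule(celsius):
    """Traditional software: the programmer writes the rule by hand."""
    return celsius * 9 / 5 + 32


def fit_line(inputs, outputs):
    """Machine learning: recover a slope and an intercept from example pairs
    using ordinary least squares (closed form for a single input variable)."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: inputs paired with their known outputs.
celsius_examples = [-40.0, 0.0, 10.0, 37.0, 100.0]
fahrenheit_examples = [-40.0, 32.0, 50.0, 98.6, 212.0]

# "Training": the algorithm discovers the rule instead of being given it.
slope, intercept = fit_line(celsius_examples, fahrenheit_examples)


def learned_model(celsius):
    """The model produced by training: it computes outputs for new inputs."""
    return slope * celsius + intercept
```

The learned model ends up agreeing with the hand-written rule, but it knows nothing outside this one task, which is the task-dependence described above.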
From a technical perspective, the process of training consists in optimizing mathematical functions which try to capture the relationship between input and output as well as possible. One very common notion one encounters when exploring the literature is that the more data the system is fed, the more it will learn. This statement is somewhat misleading, since it depends on what is understood by “more”. What the system learns is how to better capture input-output relationships by optimizing the parameters of the function it is working with. It does not learn more in the sense that a child would, by acquiring a more varied knowledge of the real world. Granted, there are also unsupervised methods of machine learning, where an output is not provided to the algorithm, which is instead expected to find patterns that humans would otherwise have missed. However, the essential concept of working to optimize a mathematical function remains the same.
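To show what “optimizing the parameters of a function” means in practice, here is a minimal sketch of gradient descent on a mean squared error. The data, the learning rate, the number of steps and all names are assumptions made for illustration; real systems use far larger models, but the principle of adjusting parameters to reduce an error measure is the same.

```python
def mean_squared_error(weight, bias, inputs, outputs):
    """The function being optimized: average squared prediction error."""
    n = len(inputs)
    return sum((weight * x + bias - y) ** 2 for x, y in zip(inputs, outputs)) / n


def train(inputs, outputs, learning_rate=0.05, steps=2000):
    """Gradient descent: repeatedly nudge the two parameters in the
    direction that reduces the mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        errors = [weight * x + bias - y for x, y in zip(inputs, outputs)]
        # Partial derivatives of the error with respect to each parameter.
        grad_weight = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_bias = 2 * sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Hypothetical examples generated by the relationship y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [1.0, 3.0, 5.0, 7.0, 9.0]

initial_loss = mean_squared_error(0.0, 0.0, inputs, outputs)
weight, bias = train(inputs, outputs)
final_loss = mean_squared_error(weight, bias, inputs, outputs)
```

Everything the system has “learned” here is contained in the two numbers `weight` and `bias`. Feeding it more examples of the same relationship would refine those two numbers, not broaden what the system knows.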
Since ML systems are practical applications of various branches of mathematics, one of the most important philosophical problems we can then pose is the extent to which reality or the process of learning can be satisfactorily codified in mathematical expressions. The potential of current artificial intelligence, as function-optimizing systems, is therefore not unlimited if it can be shown that there are problems within mathematical systems that cannot be solved. Fortunately, this was achieved in 1931 by Kurt Gödel with his two incompleteness theorems. Gödel showed that any sufficiently expressive mathematical or logical system is either incomplete or inconsistent. Since accepting inconsistency would be a disaster, as the system could be used to prove any statement and would therefore serve no purpose, we are compelled to accept that the mathematical system in question is incomplete. That means that problems will arise within the system which the axioms used in building it cannot settle. While incompleteness can be quite easily illustrated by the example of the “halting problem”, a slightly easier to digest version of this, although not identical, is Basarab Nicolescu’s “principle of included tertiary”, which states that, for a given A and non-A in a system S1, there can exist a T in a superior, higher system S2 such that T is both A and non-A. When they first learn subtraction in school, children are taught that the minuend has to be a number higher than the subtrahend. This is to be expected, since they are working within the mathematics of natural numbers. Within that system the “3–5” operation is nonsensical. However, the problem can easily be solved when we move up to an integer-based system, where the operation is perfectly valid.
Another example would be that in classical physics, physical entities could be described as either particles (A) or waves (non-A); within quantum mechanics, however, photons and electrons possess properties belonging to both particles and waves (T).
While the principle of included tertiary helps us understand how reconciliations can be made within new systems in order to solve previously impossible problems in the old ones, Gödel’s incompleteness theorems apply to any sufficiently expressive logical system. Even if we create a new mathematics by adding to existing axioms, that new system will itself be subject to the incompleteness theorems. The metaphysical implications of Gödel’s work are incredibly far-reaching and deserve separate studies; however, we shall be content with stating our belief that they provide a sufficient foundation for arguing that, since our ML algorithms are entirely within the domain of mathematics, they are inherently limited.
Since we can consider both linguistics and computer science to study different versions of the concept of language (natural languages and programming ones), it is worth considering how humans learn from the point of view of language acquisition and related theories of the mind. In order for an algorithm to learn to accurately distinguish and classify even as few as two objects, it has to be provided with many examples and corrected along the way. After the training process, it might still have issues with objects that are similar to, but not in the same class as, what it was supposed to recognize. When children learn about new objects with the use of language, i.e., they learn the word for what they are seeing for the first time, they rarely need to be provided with a second example. The reason is that the word-label helps the human mind codify a concept, from which it can generalize and then particularize again indefinitely. More than being a simple sequence of sounds used in communication, the word for “table” encapsulates the idea of ‘table-ness’: even if one encounters a four-legged table made of wood and then a two-legged table made of marble, the similarities of design, intended purpose, etc., help humans avoid misclassifying the second instance.
How humans learn in general is still very much an open debate with countless theories of merit. Repetition, imitation, and environment undoubtedly all play vital roles; however, we must consider theories from the likes of Noam Chomsky or Steven Pinker, who draw attention to the fact that there seem to be innate mechanisms, properties of the brain and mind themselves, that help humans know. Chomsky is, of course, very famous for his theory of ‘universal grammar’, which states that all children are born with knowledge of a universal set of rules and patterns that seem to exist at the core of every natural language. No matter how different, all languages differentiate between verbal and nominal categories, they all seem to follow the ‘Subject did Action to Object’ pattern, and so on. When children learn their mother tongue, according to this theory, what they actually learn is how to adapt Universal Grammar to its specific requirements. A powerful argument in support of these claims is that not only can children learn to produce infinitely many sentences after being exposed to a finite number of examples, but they also have an intuitive understanding of which sentences are grammatical and which are not before being exposed to formal teaching in the field or even to enough linguistic data.
Aspects of innate knowledge do not seem to be restricted to the realm of language acquisition alone. For example, babies are surprised when, after putting a teddy bear behind a blanket, you pull out a plane, because they expect that the nature of the reality they inhabit is such that objects do not transmute. There is no reason why they should have this expectation before experimenting more with the world or studying theoretical physics; however, many similar experiments seem to suggest that a great many aspects of learning may be innate and that the nurture side of the famous nature-versus-nurture debate cannot explain everything. The scholarly debates about many such aspects are, of course, multi-faceted or even open to interpretation. If the theories emphasizing innateness are true, it would be impossible to replicate the process with algorithms and mathematics.
On the other hand, support for the advancement of artificial intelligence could also come from evolutionary psychology. In 1997’s “How The Mind Works”, Steven Pinker tried to distance himself from what he refers to as the “standard social science model” (SSSM). According to the latter paradigm, the mind, as a general-purpose cognitive device, is entirely shaped by environment and culture. This presupposition has been tacitly accepted as scientific fact by academics everywhere, especially after the counterculture movement, in an attempt to over-correct for the horrors created by the racial theories of the previous century. Pinker takes an approach centered on evolution, biology and what could be thought of as ‘human nature’, and notices that both the denial and the excited affirmation of the latter have been used in left-wing as well as right-wing movements to cause harm. It would, however, be unscientific to dismiss it.
Pinker frames all of our big questions in terms of evolutionary theory. It is interesting to note that he does not consider intelligence something the evolutionary process has to culminate in, or strive to produce. Evolution is more like a tree than a ladder, with natural selection favoring whatever the organism needs for its particular circumstances. In order to work with morality and other philosophical aspects of human behavior within such a paradigm, he underlines that the existence of something in nature is by itself neither right nor wrong and that innateness, as described above, does not overrule free will or exempt anyone from responsibility.
Steven Pinker is a proponent of something called the “computational theory of the mind”. It is a transdisciplinary effort from linguistics, philosophy, and cognitive science to form a view in which the human mind acts as an information processor and cognition is a form of computation. In short, according to Pinker and his colleagues, the mind, which is not the brain but rather some of what the brain does, and the modular, non-monolithic way in which it does it, engages in understanding, which, as we mentioned, is at its core computation: manipulation of symbols and concepts that happens at an incredibly fast rate with immense amounts of information. Ergo, much of cognition is represented by back-propagating neural networks structured into programs for manipulating symbols. Pinker reduces rational and flexible thought to smaller and smaller networks of information processing, but he does all of this within the aforementioned framework of evolutionary psychology. This becomes very apparent when, for example, he notes that our minds are not adapted to deal with abstract entities. They have adapted to deal with objects and forces in nature: fighting, food, health, etc. We keep the inferences attached to such concepts, but remove the content and apply them to different domains, resulting in metaphorical thinking. The examples are few and, although he does explain some (such as “love is a patient” in sayings like “sick relationship” and “virtue is up” in sayings like “high-minded”), we cannot help but point out that this feels like an over-extension, as humans can learn the concepts behind abstract entities with no examples.
The computational theory of the mind has gained popularity among many public thinkers who are also interested in artificial intelligence, the most prominent of whom at the time of this writing would probably be Yuval Harari (see “Homo Deus: A Brief History Of Tomorrow”, 2015). If all the human mind does at its core is some kind of processing of symbols, perhaps we can create mathematics sufficiently expressive to represent them in code and move a step closer to obtaining General Artificial Intelligence. We would like to mention that computational theories do a rather poor job of explaining creativity and artistic expression to a degree that is compatible with what talented people have to say about their own craft. Figures as diverse as Michelangelo talking about sculpting David, Stephen King explaining the process of writing a novel, or even Hans Zimmer speaking about composing music will, by and large, state very similar ideas. Michelangelo is famously said to have explained that David was inside the block the entire time and all he had to do was remove the surplus marble. Zimmer, just like Michael Jackson, claims that he hears music play inside his mind out of the blue. King goes even further with the spiritual-sounding metaphors and claims that stories “come” to writers, who are merely the instruments for bringing them into existence. Just like the value of art produced by ML algorithms, the issue of creative expression is one of the most important problems in aesthetics, and it is conceivable that we will not reach a satisfactory answer soon. However, a holistic approach should at least consider transcendentalist arguments, especially if they are coming from the artists themselves. The “New Age” movement has shown that, despite reaching unprecedented levels of technological and scientific achievement, people continue turning to spirituality. Perhaps this need too is biologically encoded.
Learning and related issues are also deeply linked with the problems of consciousness and intentionality, which complicate matters a great deal in discussions about AI. Free will and consciousness seem to be at odds with the idea of innate characteristics and with neuroscience in general; however, there exists a trend of looking at consciousness as an emergent property of the mind that cannot be explained by any of its parts alone, just as flying is not a property of any of the parts that go into the making of a plane. David Eagleman does a great job of explaining this concept in his award-winning 2011 book, “Incognito: The Secret Lives Of The Brain”. Intentionality in the sense of “will” with regard to machines is also discussed by Steven Pinker in “Enlightenment Now” (2018). He emphasizes the vital difference between the ability to do something and the desire to actually do it, and states that even if we can code algorithms with great ability, he doubts we can code intentionality in the sense that human agency has it. ML systems would therefore be no different from any other tool: there at our disposal, to use as we please, and driven by our intentionality.
We can conclude with a short overview of the extent to which learning in AI algorithms can presently impact our lives. At the time of this writing, there is nothing in the literature that even remotely suggests the possibility of achieving GAI in the near future and, throughout this essay, we have tried to show that it is possible that it may not happen at all. Far from the dystopian visions of surpassing our abilities or subjugating humanity that popular culture depicts, AI is being used today to automate many important tasks and even to make decisions that surpass those humans would have made in certain areas. Perhaps counterintuitively, Pedro Domingos’s suggestion in “The Master Algorithm” is that the jobs that will be safe from AI are not necessarily those that demand a high level of intellectual work, but rather those that depend on the context of the real world. Thus, a doctor is far more likely to be replaced (AI can already be used for diagnosis) than a nurse. Domingos believes that the humanities will be on the rise, while extremely narrow tasks run the risk of being automated. Regardless, the most important take-away would be an understanding of how to properly frame the problem of learning in systems of artificial intelligence. Tools are simply that. Historically, the case has never been ‘tool versus man’, but rather ‘man with tool versus man without’.