Despite the hype surrounding AI, creating an intelligence that rivals or exceeds human levels is far more complicated than we have been led to believe. Professors Gary Marcus and Ernest Davis have spent their careers at the forefront of AI research and have witnessed some of the greatest milestones in the field, but they argue that a computer beating a human in Jeopardy! does not signal that we are on the doorstep of fully autonomous cars or superintelligent machines. The achievements in the field thus far have occurred in closed systems with fixed sets of rules, and these approaches are too narrow to achieve genuine intelligence.
In contrast, the real world is extremely complex and open. How can we bridge this gap? Marcus and Davis believe that the next opportunity for AI lies in learning from human thinking, because humans still far surpass machines in understanding and in flexible reasoning.
Drawing on cognitive science (psychology, linguistics, and philosophy), the two professors put forward 11 suggestions for the development of AI in their book “Rebooting AI”.
From behaviorism and Bayesian reasoning to deep learning, researchers have repeatedly proposed simple, unified theories meant to explain all intelligent human behavior.
Firestone and Scholl made the point in 2016: “There is no one way the mind works, because ‘thinking’ is not one thing. Instead, the mind is composed of different parts, and each part operates differently: seeing a color works differently from planning a vacation, which works differently from understanding a sentence, moving a limb, remembering a fact, or feeling an emotion.”
The human brain is extraordinarily complex and diverse: more than 150 clearly distinguishable brain regions; roughly 86 billion neurons of hundreds (perhaps thousands) of different types; trillions of synapses; and hundreds of different proteins within each individual synapse.
A truly intelligent and flexible system is likely to be just as complex as the human brain. Any theory that reduces intelligence to a single principle, or to a single “master algorithm,” is therefore doomed to fail.
Cognitive psychology research focuses on internal representations, such as beliefs, desires, and goals, as does classic AI.
For example, to describe President Kennedy’s famous 1963 trip to Berlin, one can record facts such as part-of (Berlin, Germany) and visited (Kennedy, Berlin, 1963). Collections of such representations constitute “knowledge,” and reasoning is built on that foundation: from these two facts, inferring that Kennedy visited Germany is trivial.
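To make this concrete, here is a minimal sketch (an illustration for this article, not code from the book) of how such facts can be stored as relation triples and combined by a trivial rule; the year argument is dropped for brevity:

```python
# Facts represented as explicit relation triples, as in classic AI.
facts = {
    ("visited", "Kennedy", "Berlin"),   # visited(Kennedy, Berlin); year omitted
    ("part-of", "Berlin", "Germany"),   # part-of(Berlin, Germany)
}

def visited_country(person: str, country: str) -> bool:
    """Rule: visited(P, X) and part-of(X, C) together imply P visited C."""
    places = {x for (rel, p, x) in facts if rel == "visited" and p == person}
    return any(("part-of", place, country) in facts for place in places)

print(visited_country("Kennedy", "Germany"))  # True
```

Because the facts are explicit, the inference is exact and transparent; nothing about it is approximate.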
At present, deep learning tries to make do with vectors that only roughly capture what is going on, and such vectors cannot directly represent propositions at all.
In deep learning there is no direct way to represent visited (Kennedy, Berlin, 1963) or part-of (Berlin, Germany); every description is only an approximation of the facts. Deep learning currently struggles with reasoning and abstraction precisely because it was never designed to represent exact factual knowledge in the first place; once the facts are blurred, it is hard to reason correctly. The GPT-3 system is a good example, and the related system BERT cannot give reliable answers to questions such as “If you put two trophies on a table and then add another, how many do you have?”
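A toy illustration of why this matters (far simpler than any real deep network, but the failure mode is the same in spirit): if a sentence is summarized as an unordered vector of word counts, the representation cannot say who did what to whom.

```python
from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    """Summarize a sentence as an unordered word-count vector."""
    return Counter(sentence.lower().split())

# Two sentences asserting opposite propositions...
a = bag_of_words("Kennedy visited Berlin")
b = bag_of_words("Berlin visited Kennedy")

# ...collapse to the identical vector: the relational structure is gone.
print(a == b)  # True
```

A representation that blurs visited (Kennedy, Berlin) into the same summary as visited (Berlin, Kennedy) cannot support precise reasoning about either.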
Much of what we know is quite abstract. For example, “X is the sister of Y” describes a relationship that holds between many different pairs of people: Malia is the sister of Sasha, Princess Anne is the sister of Prince Charles, and so on. We know not only which particular people are sisters but also what it means to be a sister in general, and we can apply that general knowledge to individuals.
If two people have the same parents, we can infer that they are siblings. If we know that Laura is the daughter of Charles and Caroline, and we then discover that Mary is also their daughter, we can infer that Mary and Laura are sisters.
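Expressed as a sketch (again illustrative), the inference is a single general rule applied over parent-of facts:

```python
from itertools import combinations

# Hypothetical knowledge base built from the example above: (parent, child).
parent_of = {
    ("Charles", "Laura"), ("Caroline", "Laura"),
    ("Charles", "Mary"),  ("Caroline", "Mary"),
}

def parents(child: str) -> frozenset:
    """All recorded parents of a given child."""
    return frozenset(p for (p, c) in parent_of if c == child)

def siblings() -> set:
    """Rule: two distinct people who share the same parents are siblings."""
    children = {c for (_, c) in parent_of}
    return {frozenset((a, b))
            for a, b in combinations(children, 2)
            if parents(a) and parents(a) == parents(b)}

print(siblings())  # {frozenset({'Laura', 'Mary'})}
```

The rule never mentions Laura or Mary by name; it captures what siblinghood means in general and applies it to whatever individuals the facts describe.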
The representations that underlie cognitive models and common sense are built from abstract relationships combined into complex structures. We can abstract almost anything: moments in time (such as “10:35 PM”), places (such as “the Arctic”), particular events (such as “the Lincoln assassination”), sociopolitical organizations (such as “the U.S. State Department”), and theoretical constructs (such as “grammar”). We use these abstractions to explain and construct stories, to cut through complex situations to their essence, and to reason about all manner of things in the world.
Marvin Minsky proposed that we should view human cognition as a “society of mind” containing dozens or hundreds of different “agents,” each specialized for a different kind of task. Drinking a cup of tea, for example, requires the interaction of a GRASPING agent, a BALANCING agent, a THIRST agent, and a number of MOVING agents. Much work in evolutionary and developmental psychology points the same way: the mind contains not one kind of thinking but many.
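As a rough sketch of the architectural idea (an illustration for this article, not Minsky’s own formulation), each agent can be modeled as a narrow specialist, with the overall behavior emerging from their interaction rather than from any single mechanism:

```python
# Each agent is a narrow specialist that reads and updates a shared state.
class ThirstAgent:
    def act(self, state: dict) -> None:
        state["goal"] = "drink tea"        # raises the goal

class GraspingAgent:
    def act(self, state: dict) -> None:
        state["holding"] = "cup"           # closes the hand around the cup

class BalancingAgent:
    def act(self, state: dict) -> None:
        state["cup_upright"] = True        # keeps the tea from spilling

class MovingAgent:
    def act(self, state: dict) -> None:
        state["cup_position"] = "mouth"    # brings the cup to the mouth

# No single agent "drinks tea"; the behavior emerges from their interaction.
state: dict = {}
for agent in (ThirstAgent(), GraspingAgent(), BalancingAgent(), MovingAgent()):
    agent.act(state)
print(state)
```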
Machine learning, however, prefers end-to-end models that use a single homogeneous mechanism with little internal structure, which is exactly the opposite of this view.
Good AI researchers often use hybrid systems to solve complex problems. To win at Go, for example, a system must combine deep learning, reinforcement learning, game-tree search, and Monte Carlo search. Watson’s victory in Jeopardy!, question-answering assistants such as Siri and Alexa, and web search engines all take the “kitchen sink” approach, integrating many different kinds of processes. In “The neuro-symbolic concept learner: Interpreting scenes, words, and sentences from natural supervision,” Mao et al. introduced a system that combines deep learning with symbolic techniques and achieved good results in visual question answering and image-text retrieval. Marcus discusses many other hybrid systems in “The next decade in AI: four steps towards robust artificial intelligence.”
Even at a fine-grained level, a cognitive system usually contains multiple mechanisms.
Take verbs and their past-tense forms. In English and many other languages, some verbs form the past tense by a simple rule (walk-walked: append -ed to the stem), while others form it irregularly (sing-sang, bring-brought).
Based on data about the errors children make when forming past tenses, Gary Marcus and Steven Pinker proposed a hybrid model that has structure even at this micro level: regular verbs are generalized by a rule, while the past tenses of irregular verbs are produced by an associative network.
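A minimal sketch of the two routes (with a plain lookup table standing in for the associative network, which in the actual model is learned from experience):

```python
# Memorized route: irregular past tenses retrieved by association (here, a
# simple lookup table stands in for the associative network).
IRREGULAR_PAST = {"sing": "sang", "bring": "brought", "go": "went"}

def past_tense(verb: str) -> str:
    """Irregulars come from memory; everything else falls to the rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return verb + "ed"   # rule route: append -ed to the stem

print(past_tense("walk"))   # walked
print(past_tense("sing"))   # sang
print(past_tense("blick"))  # blicked: the rule generalizes to novel verbs
```

The rule route generalizes to verbs never seen before, which is exactly how children overregularize when they produce forms like “goed” and “bringed”.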