Three of the main barriers to putting large language models (LLMs) to use in practical applications are their unpredictability, weak reasoning and lack of interpretability. Unless these challenges are addressed, LLMs will not be trustworthy tools in critical settings.
In a recent paper, cognitive scientist Gary Marcus and AI pioneer Douglas Lenat delve into these challenges, which they formulate into 16 desiderata for a trustworthy general AI. They argue that the required capabilities mostly come down “to knowledge, reasoning and world models, none of which is well handled within large language models.”
LLMs, they point out, lack the slow, deliberate reasoning capabilities that humans possess. Instead, they operate more akin to our fast, unconscious thinking, which can lead to unpredictable results.
Marcus and Lenat propose an alternative AI approach that could “theoretically address” these limitations: “AI educated with curated pieces of explicit knowledge and rules of thumb, enabling an inference engine to automatically deduce the logical entailments of all that knowledge.”
They believe that LLM research can learn and benefit from Cyc, a symbolic AI system that Lenat pioneered more than four decades ago, and suggest that “any trustworthy general AI will need to hybridize the approaches, the LLM approach and [the] more formal approach.”
What’s missing from LLMs
In their paper, Lenat and Marcus say that while AI does not need to think in exactly the same way as humans do, it must have 16 capabilities to be trusted “where the cost of error is high.” LLMs struggle in most of these areas.
For example, AI should be able to “recount its line of reasoning behind any answer it gives” and trace the provenance of every piece of knowledge and evidence that it brings into its reasoning chain. While some prompting techniques can elicit the semblance of reasoning from LLMs, those capabilities are shaky at best and can turn contradictory with a little probing.
Lenat and Marcus also discuss the importance of deductive, inductive and abductive reasoning as capabilities that can enable LLMs to investigate their own decisions, find contradictions in their statements and make the best decisions when conclusions cannot be reached logically.
The authors also point to analogies as an important missing piece of current LLMs. Humans often use analogies in their conversations to convey information or make a complex topic understandable.
Theory of mind
Another important capability is “theory of mind,” which means the AI should have a model of its interlocutor’s knowledge and intentions to guide its interactions and be able to update its behavior as it continues to learn from users.
Marcus and Lenat also highlight the need for the AI to have a model of itself. It must understand “what it, the AI, is, what it is doing at the moment and why,” and it must also have “a good model of what it does and doesn’t know, and a good model of what it is and isn’t capable of and what its ‘contract’ with this user currently is.”
Trustworthy AI systems must be able to include context in their decision-making and be able to distinguish what type of behavior or response is acceptable or unacceptable in their current setting. Context can include things such as environment, task and culture.
What the creators of Cyc learned
Lenat founded Cyc in 1984. It’s a knowledge-based system that provides a comprehensive ontology and knowledge base that the AI can use to reason. Unlike current AI models, Cyc is built on explicit representations of real-world knowledge, including common sense, facts and rules of thumb. It includes tens of millions of pieces of information entered by humans in a way that can be used by software for quick reasoning.
Some scientists have described Cyc as a failure and dead end. Perhaps its most important limitation is its dependence on manual labor to expand its knowledge base. In contrast, LLMs have been able to scale with the availability of data and compute resources. But so far, Cyc has enabled several successful applications and has brought important lessons for the AI community.
In its first years, the creators of Cyc realized the indispensability of having an expressive representation language.
“Namely, a trustworthy general AI needs to be able to represent more or less anything that people say and write to each other,” Lenat and Marcus write.
Expressing assertions and rules
By the late 1980s, the creators of Cyc developed CycL, a language to express the assertions and rules of the AI system. CycL has been built to provide input into reasoning systems.
While Cyc has tens of millions of hand-written rules, it can “generate tens of billions of new conclusions that follow from what it already knows” with just one step of reasoning, the authors write. “In just a few more reasoning steps, Cyc could conclude trillions of trillions of new, default-true statements.”
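To make the idea of "one step of reasoning" concrete, here is a minimal sketch of forward chaining over a toy knowledge base. It is not CycL and not how Cyc is implemented: the triple format, the predicate names `isa` and `genls`, and the two rules are assumptions chosen only for illustration.

```python
from itertools import product

# Toy facts as (predicate, subject, object) triples.
# Hypothetical data for illustration; this is not CycL syntax.
facts = {
    ("isa", "Fido", "Dog"),
    ("isa", "Rex", "Dog"),
    ("genls", "Dog", "Mammal"),      # "genls" here means "is a subclass of"
    ("genls", "Mammal", "Animal"),
}


def step(known):
    """Apply two hand-written rules once and return only the new conclusions."""
    derived = set()
    for (p1, a, b), (p2, c, d) in product(known, repeat=2):
        if b != c:
            continue  # the two facts do not chain together
        # Rule 1: an instance of a class is also an instance of its superclass.
        if p1 == "isa" and p2 == "genls":
            derived.add(("isa", a, d))
        # Rule 2: the subclass relation is transitive.
        if p1 == "genls" and p2 == "genls":
            derived.add(("genls", a, d))
    return derived - known


first = step(facts)            # conclusions reachable in one reasoning step
second = step(facts | first)   # further conclusions after a second step
print(sorted(first))
print(sorted(second))
```

Even in this tiny example, each pass produces conclusions that nobody entered by hand, and each new conclusion can feed the next pass. With millions of rules instead of two, that compounding is what the authors describe.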
Creating an expressive language for knowledge representation that enables reasoning on facts is not something that can be omitted through a brute-force shortcut, the authors believe. They criticize the current approach of training LLMs on vast amounts of raw text in the hope that the models will gradually develop their own reasoning capabilities.
Much of the implicit information that humans omit in their day-to-day communication is missing in such text corpora. As a result, LLMs will learn to imitate human language without being able to do robust common-sense reasoning about what they are saying.
Bringing Cyc and LLMs together
Lenat and Marcus acknowledge that both Cyc and LLMs have their own limitations. On the one hand, Cyc’s knowledge base is not deep and broad enough. Its natural language understanding and generation capabilities are not as good as those of Bard and ChatGPT, and it cannot reason as fast as state-of-the-art LLMs.
On the other hand, “current LLM-based chatbots aren’t so much understanding and inferring as remembering and espousing,” the scientists write. “They do astoundingly well at some things, but there is room for improvement in most of the 16 capabilities” listed in the paper.
The authors propose a synergy between a knowledge-rich, reasoning-rich symbolic system such as Cyc and LLMs. They suggest both systems can work together to address the “hallucination” problem, which refers to statements made by LLMs that are plausible but factually false.
For example, Cyc and LLMs can cross-examine and challenge each other’s output, thereby reducing the likelihood of hallucinations. This is particularly important, as much of the commonsense knowledge is not explicitly written in text because it is universally understood. Cyc can use its knowledge base as a source for generating such implicit knowledge that is not registered in LLMs’ training data.
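The cross-examination idea can be sketched as a lookup against a symbolic knowledge base. The example below is a simplified illustration, not the mechanism proposed in the paper: it assumes the LLM's statements have already been parsed into triples, and the names `KB_TRUE`, `KB_FALSE` and `cross_examine` are hypothetical.

```python
# Hypothetical knowledge base: statements known to hold and statements known not to hold.
KB_TRUE = {
    ("capital_of", "Paris", "France"),
    ("isa", "Whale", "Mammal"),
}
KB_FALSE = {
    ("isa", "Whale", "Fish"),
}


def cross_examine(claims):
    """Label each claim as supported, contradicted or unknown with respect to the KB."""
    verdicts = {}
    for claim in claims:
        if claim in KB_TRUE:
            verdicts[claim] = "supported"
        elif claim in KB_FALSE:
            verdicts[claim] = "contradicted"
        else:
            # The KB is silent, so the claim can be neither confirmed nor rejected.
            verdicts[claim] = "unknown"
    return verdicts


# Claims assumed to have been extracted from an LLM response.
llm_claims = [
    ("isa", "Whale", "Fish"),
    ("capital_of", "Paris", "France"),
    ("capital_of", "Lyon", "France"),
]
for claim, verdict in cross_examine(llm_claims).items():
    print(claim, "->", verdict)
```

A real system would rely on inference rather than exact matching, but the three-way outcome is the useful part: a contradicted claim can be challenged, while an unknown one marks a gap in the knowledge base.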
Knowledge and reasoning to explain output
The authors suggest using Cyc’s inference capabilities to generate billions of “default-true statements” from the explicit information in its knowledge base. These statements could then serve as the basis for training future LLMs to be more biased toward common sense and correctness.
Moreover, Cyc can be used to fact-check data that is being fed into the LLM for training and filter out any falsehoods. The authors also suggest that “Cyc could use its understanding of the input text to add a semantic feedforward layer, thereby extending what the LLM is trained on, and further biasing the LLM toward truth and logical entailment.”
This way, Cyc can provide LLMs with knowledge and reasoning tools to explain their output step by step, enhancing their transparency and reliability.
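One way to picture a step-by-step explanation is an inference engine that records, for every derived statement, the premises it came from. The sketch below is a toy under stated assumptions: the triple format, the single inheritance rule and the function names `closure_with_provenance` and `explain` are illustrative and do not come from Cyc.

```python
# Hypothetical asserted facts; "genls" is used here to mean "is a subclass of".
asserted = {
    ("isa", "Fido", "Dog"),
    ("genls", "Dog", "Mammal"),
    ("genls", "Mammal", "Animal"),
}


def closure_with_provenance(facts):
    """Derive all class memberships and remember which premises produced each one."""
    why = {fact: None for fact in facts}  # None marks a fact that was asserted directly
    changed = True
    while changed:
        changed = False
        for f1 in list(why):
            for f2 in list(why):
                # Rule: an instance of a class is also an instance of its superclass.
                if f1[0] == "isa" and f2[0] == "genls" and f1[2] == f2[1]:
                    new = ("isa", f1[1], f2[2])
                    if new not in why:
                        why[new] = (f1, f2)
                        changed = True
    return why


def explain(fact, why, depth=0):
    """Return an indented trace showing how a fact follows from asserted facts."""
    premises = why[fact]
    label = "asserted" if premises is None else "derived"
    lines = ["  " * depth + f"{fact} [{label}]"]
    for premise in premises or ():
        lines.extend(explain(premise, why, depth + 1))
    return lines


why = closure_with_provenance(asserted)
print("\n".join(explain(("isa", "Fido", "Animal"), why)))
```

Every line in the trace is either a fact someone asserted or a conclusion tied to named premises, which is the kind of provenance the authors want a trustworthy AI to be able to recount.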
LLMs, on the other hand, can be trained to translate natural language sentences into CycL, the language that Cyc understands. This can enable the two systems to communicate. It can also help generate new knowledge for Cyc at lower cost.
Marcus said he is an advocate for hybrid AI systems that bring together neural networks and symbolic systems. Combining Cyc and LLMs could be one way to bring that vision to fruition.
“There have been two very different types of AI’s being developed for literally generations,” the authors conclude, “and each of them is advanced enough now to be applied — and each is being applied — on its own; but there are opportunities for the two types to work together, perhaps in conjunction with other advances in probabilistic reasoning and working with incomplete knowledge, moving us one step further toward a general AI which is worthy of our trust.”