I’ve been looking at the recent talk at Baylor University by Robert J. Marks II (Director of the Walter Bradley Center for Natural and Artificial Intelligence). Marks confidently presents a stern, mathematically-grounded critique of the current AI hype, specifically targeting the idea that AI can “create” new information or achieve true reasoning through current methods.
Marks’ argument draws on the papers “The Famine of Forte” (2017) and “The Futility of Bias-Free Learning and Search” (2019). The relevant part of the talk starts at this timestamp: https://youtu.be/ShusuVq32hc?t=2338.
He asserts that any artificial learning system is bound by the Conservation of Information (COI): any “discovery” an AI makes is simply an extraction of information already latent in its training data or its search priors, and obtaining it carries a specific information cost.
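To make the claim concrete for myself, I put together a toy No Free Lunch style check. This is only my own sketch, not the construction from the papers or the talk, and every name in it (`majority_learner`, `nearest_neighbor_learner`, `average_offtraining_accuracy`) is made up for illustration. It assumes a tiny world of 3-bit inputs, trains on 5 of the 8 points, and averages held-out accuracy over all 256 possible labelings of the inputs.

```python
from itertools import product

# Toy setup (illustrative assumption, not taken from Marks' papers):
# inputs are the 8 three-bit strings; a "target" is any labeling of them.
INPUTS = list(product([0, 1], repeat=3))
TRAIN = INPUTS[:5]   # points the learner sees together with their labels
TEST = INPUTS[5:]    # held-out points the learner must predict


def majority_learner(train_labels):
    """Predict the most common training label everywhere (ties go to 1)."""
    guess = 1 if 2 * sum(train_labels) >= len(train_labels) else 0
    return lambda x: guess


def nearest_neighbor_learner(train_labels):
    """Predict the label of the closest training point (Hamming distance)."""
    def predict(x):
        def dist(p):
            return sum(a != b for a, b in zip(p, x))
        nearest = min(TRAIN, key=dist)  # first match wins on ties
        return train_labels[TRAIN.index(nearest)]
    return predict


def average_offtraining_accuracy(make_learner):
    """Average held-out accuracy over all 2**8 possible target functions."""
    targets = list(product([0, 1], repeat=len(INPUTS)))
    total = 0.0
    for labels in targets:
        target = dict(zip(INPUTS, labels))
        predict = make_learner([target[x] for x in TRAIN])
        hits = sum(predict(x) == target[x] for x in TEST)
        total += hits / len(TEST)
    return total / len(targets)


if __name__ == "__main__":
    for learner in (majority_learner, nearest_neighbor_learner):
        print(learner.__name__, average_offtraining_accuracy(learner))
```

Both learners should come out at chance (0.5, up to float rounding) on the held-out points, because with no assumption about which targets are likely, the training labels say nothing about the unseen ones. As I understand it, a learner only beats chance when its built-in bias matches structure that the environment actually has, which is the sense in which the information has to come from somewhere.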
Will this also apply to Monty?
Marks’ proofs look robust for systems that function as conditional distributions (like LLMs), but he goes further and says: “We proved that this does not apply just to large language models, but to any artificial learning system”. Even so, I suspect the Monty architecture might be operating in a different class of system, for a few reasons:
- Reference Frames vs. Statistical Correlation: Marks argues that AI is “trapped” in syntax because it only processes relationships between symbols. However, Monty isn’t just predicting tokens; it’s navigating physical (and potentially abstract) models of reality.
- Sensorimotor Integration: Marks uses the “Magic Hat” analogy to show that “producing a rabbit” (an answer) requires a “helper” (the distribution/training). In Monty, the “helper” isn’t just a static dataset, but a continuous stream of sensorimotor input. The “information cost” is paid in real time by the environment, which might bypass the “model collapse” seen in LLMs.
- Physicalist Grounding: Marks points out that syntax is not semantics. But in the Thousand Brains Theory, meaning is derived from the structural representation of an object in space. If “understanding” is the result of a physical architecture (the cortical column) navigating these frames, it may be a way out of the “syntactic trap” that Marks deems inescapable for “any artificial system”.
My question
On the statement:
"We proved that this does not apply just to large language models, but to any artificial learning system, which means this will not go away if we come up with a new architecture…"
Is Monty, like LLMs, still bound by the law that the understanding of an artificial system can never exceed the information inherent in its architecture and input?
I’m curious to hear if anyone has any ideas around whether Monty’s biological-mimicry approach could eventually make it an exception to this “mathematical ceiling”.