Does the "Conservation of Information" apply to the Thousand Brains Theory?

I’ve been looking at the recent talk at Baylor University by Robert J. Marks II (Director of the Walter Bradley Center for Natural and Artificial Intelligence). Marks confidently presents a stern, mathematically-grounded critique of the current AI hype, specifically targeting the idea that AI can “create” new information or achieve true reasoning through current methods.

Marks’ argument, which he makes at this timestamp of the video (https://youtu.be/ShusuVq32hc?t=2338), is based on his papers “The Famine of Forte” (2017) and “The Futility of Bias-Free Learning and Search” (2019).
He asserts that any artificial learning system is bound by the Conservation of Information (COI)—meaning that any “discovery” an AI makes is simply an extraction of information already latent in its training data or its search priors, paid for by a specific information cost.
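
To make the “information cost” idea concrete, here is a minimal sketch of the active-information bookkeeping that Dembski and Marks use in their earlier conservation-of-information work. I am assuming this is the same accounting Marks has in mind in the talk; the function name and the numbers are hypothetical, chosen only for illustration.

```python
import math


def active_information(p: float, q: float) -> float:
    """Active information in bits: log2(q / p).

    p: probability that a blind (uniform) search hits the target.
    q: probability that the assisted search hits the target.
    """
    if not (0 < p <= 1 and 0 < q <= 1):
        raise ValueError("p and q must be probabilities in (0, 1]")
    return math.log2(q / p)


# Hypothetical numbers, not taken from the talk or the papers.
p = 1 / 1024  # blind search: one target among 1024 equally likely options
q = 1 / 4     # assisted search: priors or training data raise the odds

endogenous = -math.log2(p)          # difficulty of the problem itself
exogenous = -math.log2(q)           # difficulty left over after assistance
active = active_information(p, q)   # bits the "helper" had to supply

print(f"endogenous={endogenous} bits, exogenous={exogenous} bits, active={active} bits")
```

On this reading, the 8 bits of improvement are not created by the search; they are supplied by whatever raised the success probability from p to q, which is the sense in which the discovery is “paid for”.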

Will this also apply to Monty?

Marks’ proofs are robust for systems that function as conditional distributions (like LLMs), and he specifically said: “We proved that this does not apply just to large language models, but to any artificial learning system”. Still, I suspect the Monty architecture might be operating in a different class of system, for a few reasons:

  • Reference Frames vs. Statistical Correlation: Marks argues that AI is “trapped” in syntax because it only processes relationships between symbols. However, Monty isn’t just predicting tokens; it’s navigating physical (with the potential for abstract) models of reality.

  • Sensorimotor Integration: Marks uses the “Magic Hat” analogy to show that “producing a rabbit” (an answer) requires a “helper” (the distribution/training). In Monty, the “helper” isn’t just a static dataset, but a continuous stream of sensorimotor input. The “information cost” is being paid in real time by the environment, which might bypass the “model collapse” seen in LLMs.

  • Physicalist Grounding: Marks points out that syntax is not semantics. But in the Thousand Brains Theory, meaning is derived from the structural representation of an object in space. If “understanding” is the result of a physical architecture (the cortical column) navigating these frames, it may be a way out of the “syntactic trap” that Marks deems inescapable for “any artificial system”.


My question

On the statement:

"We proved that this does not apply just to large language models, but to any artificial learning system, which means this will not go away if we come up with a new architecture"

Is Monty, like LLMs, still bound by the law that the understanding of an artificial system can never exceed the information inherent in its architecture and input?

I’m curious to hear if anyone has any ideas around whether Monty’s biological-mimicry approach could eventually make it an exception to this “mathematical ceiling”.


The way I see it, this is not a problem for Monty, because Monty does online learning and generates its own data by movement. So, even if Monty is limited to extracting only the information latent in its training data, that training data is continually updated by its interactions with the environment (where the environment happens to be the Universe)… you know, like a human :wink:.


1. Clarke’s First Law

Coined by science fiction writer and scientist Arthur C. Clarke:

“When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.”

As I understand it, Clarke was denied a patent on the use of geostationary satellites because the idea was deemed impossible.


I wouldn’t take Marks’ claims too literally; he seems to focus mostly on LLMs and purely statistical learning systems. The following is from his video’s companion article:

If my only method of reaching conclusions is Pavlovian impulse conditioned on statistical correlation, I am utterly incapable of valid inference. This is why LLMs do not reason in the formal sense. They cannot, given their current feed-forward, cause-and-effect mechanism. Humans, by contrast, can arrive at conclusions based on the truth of propositions. If P is true and P → Q, then I rightly and justifiably conclude that Q is true once I grasp the truth of the premises. It doesn’t matter if that conclusion has been realized a million times previously, or exactly zero times—my reasoning is not based on statistical regularity. It is based only on the truth of the antecedent and the implication. It is built from ground-consequent logical connection, not cause-and-effect conditioning.
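
The inference step the quoted passage describes (from P and P → Q, conclude Q) is modus ponens. As a minimal Lean 4 sketch, with a theorem name of my own choosing rather than anything from the article, it is a one-line proof that holds no matter how often the conclusion has been seen before:

```lean
-- Modus ponens: given a proof of P and a proof of P → Q, produce a proof of Q.
-- The name `modus_ponens` is illustrative, not taken from the quoted article.
theorem modus_ponens (P Q : Prop) (hP : P) (hPQ : P → Q) : Q :=
  hPQ hP
```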

Although module voting is a statistical process, Monty inherently operates on ground truth data (information provided by direct observation and measurement) and not abstract symbols, so it kinda falls outside such critiques.
