Text, Language, and Interpretability in Monty

Rich_Morin · April 12, 2026, 5:25pm

There is a lot of current AI work on attempting to integrate logic and reasoning (GOFAI, redux?) with LLMs et al. So, this difference may not be truly fundamental in practice. That said, I’d love to see Monty incorporate and handle more snippets of text in its messages.

I’ve written previously about the idea of using textual tags (e.g., cup, handle, logo) to ground the opaque object IDs that Monty produces. These could be added in assorted ways. For example, in a supervised learning setup, Monty could be told in advance “what it will be looking at”. Alternatively, a trained LLM (etc) could be given Monty’s inputs and results and asked to identify objects that Monty has spotted. This information could then be folded into Monty’s knowledge base.

Given a small vocabulary to work with, I could also imagine an LM generating a textual description of an object’s behavior (e.g., “The upper part of the stapler can rotate 160 degrees away from the base, using the hinge at the rear.”) Although the descriptions could be in almost any form of text (e.g., YAML), humans would probably prefer natural language.
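To make the "small vocabulary" idea concrete, here is a minimal sketch in which a structured record of an object's behavior is rendered as a sentence. A fixed template stands in for the language model, and the `HingeBehavior` fields are my own invention, not anything Monty currently produces.

```python
from dataclasses import dataclass


@dataclass
class HingeBehavior:
    """Hypothetical structured description of a hinged part."""

    object_name: str
    moving_part: str
    fixed_part: str
    max_angle_deg: float
    hinge_location: str


def describe(b: HingeBehavior) -> str:
    # A fixed template stands in for an LM; ":g" drops a trailing ".0".
    return (
        f"The {b.moving_part} of the {b.object_name} can rotate "
        f"{b.max_angle_deg:g} degrees away from the {b.fixed_part}, "
        f"using the hinge at the {b.hinge_location}."
    )


stapler = HingeBehavior(
    object_name="stapler",
    moving_part="upper part",
    fixed_part="base",
    max_angle_deg=160,
    hinge_location="rear",
)
sentence = describe(stapler)
```

The same record could just as easily be dumped as YAML for tooling; the sentence form is only there for human readers.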

Going the other way, an LLM could convert the human’s natural language queries and comments (e.g., “Try looking at the back of the stapler.”) into quantitative data that Monty’s math-based tooling could use.

In short, framing the textual interaction in terms of interpretability may be too limiting: IMHO, Monty should be given affordances which improve its interoperability with humans, LLMs, and ???.

Bryce_Bate · April 13, 2026, 12:46pm

Rich (@Rich_Morin), I’ve also thought about what a hybrid approach would mean for Monty with an LLM’s ability to interact quite well in natural language. Functionally, this form of integration might make sense in the short term. But I believe that bolting on the elements of reasoning will prove inadequate in some of the more challenging contexts like medicine where our standards for knowing are much higher. For me, it comes down to intrinsic vs extrinsic justification for knowledge claims. Or, put another way, it’s where the knowledge grounding resides–first-hand (what one has observed) vs second-hand (what one has been told). To achieve this level of warrant, I think an AI must have a way to verify its hypotheses through its own sensorimotor loop. That makes it intrinsic. TBT is the only architecture I know of that lays out this intrinsic feature.

In high-stakes medical contexts (and eventually many other significant contexts to which we may wish to apply AI), I tend to think of this as analogous to the difference between a medical student who has only passed 1000 written tests on the procedure and a surgeon who has successfully performed the procedure 1000 times. Loosely speaking, both might be said to “know” the procedure well. But strictly speaking, only the surgeon warrants the trust. No amount of additional tests passed by the student will earn that. That’s why I suggested the difference was fundamental and not one met with incremental improvements in chain-of-reasoning or symbolic logic reporting. Intrinsic evidence stems from the investigation itself–the sensorimotor movements and hypothesis testing that only come from traversing the reference frames. There will always be the difference between first-hand and second-hand evidence. Still, how it gets communicated might well benefit from a hybrid approach. I agree with you on that.

Rich_Morin · April 13, 2026, 4:54pm

It’s clear that GOFAI, LLMs, Monty, and for that matter humans have very different ways of collecting and processing information. However, all of them are constrained by their inputs, and various forms of input can be valuable, even when:

the information is second-hand
its accuracy is questionable
it conflicts with other information
…

I recently watched a fascinating, if lengthy (~1.25 hour) StarTalk episode about the history of science in general and medicine in particular:
Scientists Who Were Persecuted for Being Right, with Matt Kaplan

What happens when scientists are right and nobody wants to hear it? Neil deGrasse Tyson, Chuck Nice, and Gary O’Reilly explore the frustrating history of brilliant minds who were ignored, mocked, and punished for telling the truth with Matt Kaplan, science correspondent at The Economist and author of Told You So: Scientists Who Were Ridiculed for Being Right.
They start in the 1600s, with Galileo and the Church that insisted Earth was the center of the universe and trace how scientific rigor slowly, painfully emerged from institutions that weren’t always ready for it. Learn about Pierre Charles Alexandre Louis who dared to question whether leeches were actually helping patients, published data that showed they weren’t. Was the leech lobby too powerful? Discover Ignaz Semmelweis, the Hungarian doctor who figured out that doctors themselves were spreading childbed fever only to be fired, exiled, and committed to an asylum.

Not every scientist was a martyr, though. Louis Pasteur knew how to work a room, command the spotlight, and stay on the right side of politics. The contrast between Semmelweis and Pasteur raises an uncomfortable question: should scientific genius require a publicist?

The episode closes with Katalin Karikó, the mRNA pioneer who was stripped of funding, threatened with deportation, and dismissed by academia all before her work became the backbone of the COVID-19 vaccine and earned her a Nobel Prize. Matt and Neil reflect on what her story, and all these stories, reveal about science’s necessary conservatism, the dangers of anti-science sentiment, and why getting better at communicating how science actually works has never mattered more.

Bringing this back to the topic at hand, I’d like Monty to be able to accept (provisionally :-) all sorts of information and, more generally, exchange information with others. Until it can do that, its capabilities will be severely constrained.