Various questions mostly focused on motor output and reference frames

Alexis_Balestra · April 28, 2026, 7:43pm

Hello, I have been looking into the Thousand Brains Theory and wanted to ask some questions about it.

In the current implementation of Monty, it is easy for Monty to know how one of its movements affects its position on an object, but what about when the relation between movement and position is much more ambiguous? For instance, imagine a brand new brain: its cortical columns are not given a clear explanation of how their motor output translates into a movement in their reference frame. Moreover, obstacles or other elements might change, permanently or temporarily, how a motor output translates into movement in the reference frame. So it seems clear to me that there should be some way for the cortical column to learn how movement and reference frame align. Is there an explanation of how this would work, or if not, does anyone have an idea of how it could work?

What about non-Euclidean spaces? While the 3D world has a nice Euclidean coordinate system, many things we interact with every day, like this forum or other digital apps, do not. Instinctively, I would say that the brain stores most of its understanding of those things in a graph rather than a coordinate system, so it might be that reference frames can work in a way that is closer to a graph than to a clear coordinate system. Maybe the same mechanism that allows the cortical column to learn which movement corresponds to which displacement in its reference frame can be used to make reference frames act in a much more flexible way?

On a related note, there might be instances where the cortical column is lost with regard to its reference frame. For instance, let’s imagine that the cortical column is dropped somewhere it has seen before, but without being told exactly where it is, so it doesn’t know how to align its reference frame. It would need some way to readjust its reference frame after getting lost. Does anyone have any idea how this readjustment would work or trigger?

It seems to me that highly abstract concepts would need to be mapped to a space with a high number of dimensions. An analogy would be how LLMs have a high-dimensional embedding space, which allows vector calculations like “king - man + woman ≈ queen”. Of course, brains don’t use the same architecture as LLMs, but intuitively, I would think that there would be something similar in the brain. In that case, this would mean that reference frames can have a very high number of dimensions, but how do the grid cells create so many dimensions? Are the grid cells in some cortical columns arranged in a way that allows for more dimensions in the reference frame? Can several cortical columns work together to create the equivalent of a reference frame with a higher number of dimensions? Can the cortical columns use the same motor learning system that I described above in order to squish a high-dimensional space into a lower number of dimensions? Am I just misunderstanding something completely?

I also have some questions regarding how a cortical column chooses what motor output to perform. You have the cortical column choose a target location, then use its reference frame to calculate the motor output needed to reach that location.
Choosing the target location would need to take several things into account for it to make sense:

How rewarding that target location was last time it was visited, which would work in the human brain with dopamine or other similar reward chemicals.
The motor output of a higher-level cortical column.
However, beyond that, it seems hard to know how the cortical column would choose locations, especially since here we are only talking about locations and not features. Perhaps the way it actually works is that the cortical column first chooses a feature, then finds the location of that feature thanks to its reference frame?

Now that I think about it, if the target location is deduced in part by the motor output of a higher-level cortical column, this means that the motor output of a higher-level cortical column is equivalent to a location for a lower-level cortical column. I feel like this could be connected to my idea about motor learning. Some cortical columns could be creating “action maps” that would help other columns with how to interact with the world?

One last question: when I try to visualise something in my head, like an apple, am I using the motor output from some of the cortical columns higher in the hierarchy to force lower-level cortical columns to create this visualisation, or am I convincing myself that I am seeing an apple (even if I am not seeing one) in such a way that lower-level cortical columns try to conform to this new expectation and thus try to create the image of an apple? I feel like moving an image in your mind is not that different from moving your arm physically, so I would think there is some element of motor output in it, but I am not sure.

jhawkins · April 30, 2026, 5:10pm

Hi Alexis,
Lots of great questions. I will attempt to answer several, one at a time.

“imagine a brand new brain, the cortical columns do not get a clear explanation on how its motor output translates to a clear movement in its reference frame”

Nearly all parts of the brain and nervous system are learning and changing over time. Therefore, a general rule is that any two parts of the brain that communicate need to learn how to talk to each other. The learning has to stay active throughout life because growth, trauma, learning, etc. require constant adjustment.

Cortical columns don’t know what their inputs represent and don’t know where their output is going. It all has to be learned. Imagine a column receives input from a sub-cortical motor center. This is just a bunch of axons, some of which are spiking. These axons are correlated with movement, but the column doesn’t know how. In the column, this input is fed into a simple neural learning mechanism called a spatial pooler. The spatial pooler results in a set of mini-columns that represent basis vectors; each basis vector, i.e. each mini-column, represents movement in some direction. (You see these movement-vector mini-columns in the lower layers of the cortex.) The result of all this is that the cortical column has learned how to represent observed movement in the world as a set of active mini-columns.
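To make the idea of learned movement vectors concrete, here is a minimal sketch. It is not the HTM spatial pooler and not Monty code; it swaps in plain winner-take-all competitive learning as a stand-in. The 2D input, the four movement directions, the noise level, and all function names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the motor-center input: four true movement
# directions, each observed with bounded noise. The learner never sees labels.
directions = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
labels = np.arange(400) % 4  # cycle through the four directions
samples = directions[labels] + rng.uniform(-0.1, 0.1, size=(400, 2))


def learn_movement_prototypes(samples, n_minicolumns, learning_rate=0.1):
    """Winner-take-all learning: the best-matching mini-column moves
    a little toward each new input."""
    # Seed each mini-column with one early sample.
    prototypes = samples[:n_minicolumns].copy()
    for x in samples[n_minicolumns:]:
        winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
        prototypes[winner] += learning_rate * (x - prototypes[winner])
    return prototypes


def winning_minicolumn(prototypes, movement):
    """Index of the mini-column whose learned vector best matches a movement."""
    return int(np.argmin(np.linalg.norm(prototypes - movement, axis=1)))


prototypes = learn_movement_prototypes(samples, n_minicolumns=4)
winners = [winning_minicolumn(prototypes, d) for d in directions]
print(winners)  # one distinct mini-column per movement direction
```

In this toy version each movement activates a single mini-column. A sparse code with several active mini-columns per movement would be closer to the description above, but the principle is the same: the mapping from raw input to movement direction is learned from correlations, not given in advance.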

The column now has to learn how to control the sub-cortical motor centers. Once the column has learned how to represent observed movements, the neurons representing the movement send axons to the same sub-cortical motor center that it receives input from. The neurons in the cortical column form synapses with the active cells in the motor center during an observed movement. This is simple associative memory. After this association is learned, the column can direct behavior on its own.
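The associative step can be sketched in a few lines. This is a hedged illustration, assuming binary activity patterns and a plain Hebbian outer-product rule; the pattern sizes, the function names `learn_associations` and `drive_motor`, and the top-k readout are hypothetical choices, not something taken from Monty or from neuroscience data.

```python
import numpy as np

# Hypothetical binary patterns: 4 movements, each represented by 3 active
# column cells (out of 12) and accompanied by 2 active motor-center cells
# (out of 8). Patterns are disjoint to keep the example easy to follow.
column_patterns = np.kron(np.eye(4), np.ones((1, 3)))  # shape (4, 12)
motor_patterns = np.kron(np.eye(4), np.ones((1, 2)))   # shape (4, 8)


def learn_associations(column_patterns, motor_patterns):
    """Hebbian learning: strengthen the synapse between a column cell and
    a motor cell whenever both are active during an observed movement."""
    n_motor = motor_patterns.shape[1]
    n_column = column_patterns.shape[1]
    weights = np.zeros((n_motor, n_column))
    for col, mot in zip(column_patterns, motor_patterns):
        weights += np.outer(mot, col)
    return weights


def drive_motor(weights, column_pattern, k):
    """Activate the k motor cells that receive the most synaptic input."""
    drive = weights @ column_pattern
    active = np.zeros_like(drive)
    active[np.argsort(drive)[-k:]] = 1
    return active


weights = learn_associations(column_patterns, motor_patterns)
recalled = drive_motor(weights, column_patterns[2], k=2)
print(recalled)  # motor cells the column would drive for movement 2
```

After learning, presenting the column's own representation of a movement is enough to reactivate the motor-center cells that were active when that movement was observed, which is the sense in which the column can direct behavior on its own.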

jhawkins · April 30, 2026, 5:13pm

The mechanism I described in the previous reply makes no assumptions about the dimensionality of the movement space. It can learn 2D, 3D, or 4D spaces. It isn’t clear if this is happening in the brain, but the mechanism would work.

jhawkins · April 30, 2026, 5:32pm

We have discussed this topic a lot inside the TBP, but have not reached any firm conclusions. My intuition was the same as yours. It feels like we need high-dimensional spaces, where the dimensions may have little to do with physical dimensions. In A Thousand Brains I wrote about how math or history could be represented in non-physical space. However, there is evidence that we map even abstract concepts using Euclidean dimensions plus time. Brains may be limited in this way. This could explain why abstract ideas are hard to learn; our brains evolved to understand a physical world.

One of the cool things about Monty, and the TBT in general, is that the algorithms can in theory be applied to any type of space, whether it has a physical counterpart or not. If we build systems like this, they will understand the world in different, and presumably better, ways than we do.