How compositional models will work in Monty

Hi all, my apologies if these questions have already been answered clearly elsewhere. I have no formal background in this area; my background is physics, so my approach to understanding may not be optimal. Any advice would be appreciated.

I’m trying to understand how compositional models would work within Monty. I have briefly read the hierarchy/heterarchy paper, but I quickly realised that it mainly outlines how this should work in the neuroscience, and translating that to Monty requires a fair amount of thinking that I’m hoping to skip by having someone answer my questions directly.

Say we have a system with 2 LMs and a single SM; I assume both LMs will be hooked up to the SM. My confusion is about how the hierarchy would work between these two LMs. Say we are given a compositional object, such as a box with a knob. Would the LM lower in the hierarchy have seen the components by themselves and thus have models of just the box and just the knob? Then, when it is given the composed object, hypotheses for two objects spike while it is learning, and perhaps there is some mechanism by which such an occurrence is communicated to the other LM via the CMP. As the composed object is being scanned, the higher LM would then have some way to recognise that two object IDs can be associated with a third object ID, that of the whole object. Perhaps it learns a model of one of the objects with the other model as a feature. So one question is: can the features in the CMP include object IDs? I’m sure this picture is quite wrong, as it’s probably possible to learn compositionality without having seen the components by themselves.

So my confusion is about two LMs operating in a hierarchy and also in parallel: what does each LM individually learn models of when given a composed object? Does the lower LM learn models of only the components, or of the components and the whole? Does the second LM learn the exact same models, or only the composed model? And since both LMs are hooked up to the motor system and see the same thing, which LM decides the principled movements for building the models?

Hi @Lucretius, these are great questions.

Say we have a system with 2 LMs and a single SM; I assume both LMs will be hooked up to the SM.

Yes, both LMs will be hooked up to an SM. A higher-level column also gets direct input from the thalamus. In Monty, this translates to another SM with a larger receptive field connected directly to the higher-level LM (without having to pass through the lower-level LM). The higher-level LM therefore receives two inputs at the same time: the output of the lower-level LM, which is hypothesizing about the child object (if present), and the larger-receptive-field input from its own sensor. Note that both SMs’ receptive fields are co-located.

a model hypothesis for two objects spike and perhaps there’s some mechanism by which when there’s such an occurrence that this is communicated to the other LM via CMP

As the agent explores the compositional object, the lower-level LM switches between child objects (e.g., box or knob). It never spikes for two child objects at the same time. As the compositional object (e.g., box with a knob) is being learned in the higher-level LM, associations are formed that effectively store a child object ID as a feature at a location in the higher-level reference frame.
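To make that association concrete, here is a minimal sketch of a higher-level model that stores a child object ID as a feature at a location in its own reference frame. This is not Monty's actual graph model code: `CompositionalModel`, `add_observation`, `child_at`, and the grid-based location key are all hypothetical names and choices made for illustration.

```python
from collections import defaultdict


class CompositionalModel:
    """Toy higher-level object model (hypothetical): location -> features."""

    def __init__(self, object_id, resolution=0.01):
        self.object_id = object_id
        self.resolution = resolution  # grid size used to bucket locations
        self.features_at = defaultdict(dict)

    def _key(self, location):
        # Quantize a 3D location so nearby observations share one entry.
        return tuple(round(c / self.resolution) for c in location)

    def add_observation(self, location, child_object_id):
        # Store the child object ID as a feature at this location.
        self.features_at[self._key(location)]["object_id"] = child_object_id

    def child_at(self, location):
        # Look up which child object (if any) was stored at this location.
        return self.features_at.get(self._key(location), {}).get("object_id")


# The lower-level LM reports one child at a time as the sensor moves.
model = CompositionalModel("box_with_knob")
model.add_observation((0.00, 0.00, 0.00), "box")
model.add_observation((0.05, 0.02, 0.10), "knob")
```

Each observation carries only one child ID, which matches the point above that the lower-level LM never reports two child objects at once.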

one question is can the features in the CMP include object IDs

Yes, an object ID is a non-morphological feature stored on the CMP message constructed by an LM. This CMP message also includes the pose of the child object as a morphological feature.
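As a rough illustration of what such a message could carry, here is a small sketch. The `CMPMessage` class, its field names, and the `message_from_lower_lm` helper are assumptions made for this example and should not be read as Monty's actual message class or API.

```python
from dataclasses import dataclass, field


@dataclass
class CMPMessage:
    """Hypothetical stand-in for a CMP message; field names are illustrative."""

    location: tuple  # where the sender believes the sensed feature is
    morphological_features: dict = field(default_factory=dict)
    non_morphological_features: dict = field(default_factory=dict)
    confidence: float = 0.0
    sender_id: str = ""


def message_from_lower_lm(object_id, location, pose_vectors, confidence):
    """Build the message a lower-level LM might send upward (sketch only)."""
    return CMPMessage(
        location=location,
        # The pose of the child object travels as a morphological feature.
        morphological_features={"pose_vectors": pose_vectors},
        # The object ID travels as a non-morphological feature.
        non_morphological_features={"object_id": object_id},
        confidence=confidence,
        sender_id="lower_lm",
    )


# Identity orientation: the child object's axes align with the parent's.
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
msg = message_from_lower_lm("knob", (0.05, 0.02, 0.10), identity, 0.9)
```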

it’s probably possible to learn compositionality without having seen the components by themselves

A higher-level LM can learn a low-detail model of the compositional object on its own, since it also observes direct input from the sensor. The idea here is that if the lower-level LM has learned and inferred child objects, these are associated as additional features at locations in the higher-level LM, so that the compositional object can be inferred more efficiently. In Monty, the evidence from both channels is added together, which lets the LM infer the object faster.
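A toy sketch of the additive idea is below. The function name `update_evidence`, the dictionary layout, and the numbers are made up for illustration; Monty's real evidence update works on hypotheses over locations and poses and is more involved than this.

```python
def update_evidence(evidence, sensor_evidence, child_evidence):
    """Add evidence from both input channels to each object hypothesis."""
    return {
        obj: evidence[obj]
        + sensor_evidence.get(obj, 0.0)
        + child_evidence.get(obj, 0.0)
        for obj in evidence
    }


evidence = {"box_with_knob": 0.0, "plain_box": 0.0}

# Coarse sensor input alone fits both hypotheses equally well.
sensor = {"box_with_knob": 1.0, "plain_box": 1.0}
sensor_only = update_evidence(evidence, sensor, {})

# The lower-level LM reports "knob", which only one hypothesis expects.
child = {"box_with_knob": 1.0, "plain_box": -1.0}
both = update_evidence(evidence, sensor, child)
```

In this toy case the sensor channel alone leaves the two hypotheses tied, while adding the child object channel separates them in a single step.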

Does the lower LM learn models of only the components, or the components and the whole, and then does the second LM also learn the exact same models or only the composed model

The lower-level LM learns only the components, and the higher-level LM learns the full compositional object. To anticipate a follow-up question: “What prevents the lower-level LM from learning the full object?” This is an open research question; we don’t have a definitive answer for when an LM stops learning one model and starts learning a new one. One solution we have implemented is constrained object models, which automatically prevent an LM from learning a model that exceeds a certain physical size.
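One simple way a size limit like this could be expressed is sketched below. This is an assumption about the general idea, not the actual constrained object model implementation; `SizeConstrainedModel`, `max_size`, and `try_add` are hypothetical names.

```python
import math


class SizeConstrainedModel:
    """Toy model that refuses points which would make it too large."""

    def __init__(self, max_size):
        self.max_size = max_size  # largest allowed distance between two points
        self.points = []

    def try_add(self, location):
        # Reject the point if it would stretch the model past max_size.
        for p in self.points:
            if math.dist(p, location) > self.max_size:
                return False
        self.points.append(location)
        return True


knob_model = SizeConstrainedModel(max_size=0.05)
first = knob_model.try_add((0.0, 0.0, 0.0))
second = knob_model.try_add((0.03, 0.0, 0.0))
# A point far out on the box does not fit into the knob-sized model.
third = knob_model.try_add((0.2, 0.0, 0.0))
```

A rejected point is a natural cue for the LM to stop extending the current model and treat the observation as belonging to a different object.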

My opinion is that child objects are observed much more frequently than the parent objects. By definition, these child objects can appear on many other compositional objects. So there is a statistical component here for detecting the boundaries of what defines a child object. Additionally, there are top-down connections that stabilize and reinforce this division between child and parent objects in a compositional model. The higher-level LM predicts the next feature (object ID) it will see as the sensor moves, and therefore can bias the lower-level LM to switch to the other child object rather than learn to extend the current object.

since both LMs are hooked up to the motor system and see the same thing which LM decides the principled movements for building the models.

In Monty, LMs can propose goal states for the motor system. These goal states have confidence values and the one with the highest confidence wins. The confidence is based on evidence scores of the hypotheses. The motor system can also follow a model-free policy (e.g., saliency-based or curvature following) not informed by any LM. This same process happens any time we have multiple LMs (e.g., voting with 5 LMs at the same level), not only in a hierarchical configuration.
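The selection rule can be sketched in a few lines. `GoalState`, `select_goal_state`, and the `min_confidence` threshold are hypothetical and only illustrate "highest confidence wins, otherwise fall back to a model-free policy"; they are not Monty's actual classes.

```python
from dataclasses import dataclass


@dataclass
class GoalState:
    """Hypothetical goal state proposed by an LM."""

    sender_id: str
    target_location: tuple
    confidence: float


def select_goal_state(proposals, min_confidence=0.0):
    """Return the most confident proposal, or None if there is none."""
    candidates = [
        g for g in proposals if g is not None and g.confidence > min_confidence
    ]
    if not candidates:
        # No LM has a usable proposal: the caller would fall back to a
        # model-free policy (e.g., saliency-based or curvature following).
        return None
    return max(candidates, key=lambda g: g.confidence)


proposals = [
    GoalState("LM_0", (0.05, 0.02, 0.10), confidence=0.4),
    GoalState("LM_1", (0.00, 0.10, 0.00), confidence=0.7),
    None,  # an LM that proposed nothing this step
]
winner = select_goal_state(proposals)
fallback = select_goal_state([None, None])
```

The same function works whether the proposals come from LMs at different levels of a hierarchy or from several LMs at the same level.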

Hope this helps.


Many thanks for your detailed reply; I’ll have a lot to study. I’m currently unable to dedicate much time to it, but I hope to start making some contributions at some point. Compositionality seems key to building systems capable of various kinds of abstraction. I have an idea about representing word embeddings as some kind of object via some transformation; compositionality would then make it possible to do inference about sentences once enough samples have been processed. Partial inference, i.e. being given only part of a sentence, would essentially allow next-word prediction.
