Using Off-Object Observations during inference

carver · November 26, 2025, 4:11pm

I’m interested in taking on this task. It seems somewhat straightforward and self-contained (famous last words). I’m thinking I would start with single-object inference and come back to multi-object scenes later.

I’m wide open to feedback and suggestions. Here are some questions that I will attempt to answer myself along the way, and that I’m happy to hear recommendations about:

~~Do I add evidence to hypotheses that have no feature at that location, when you are off-object? Or only subtract evidence when a location should be there but is not observed?~~ The linked writeup is clear enough that both positive and negative evidence should be applied.
Do I need to make a choice about the scale of the negative evidence applied? How do I pick what the amplitude of the effect is?
Is this a smooth effect? When you look for a most likely nearest neighbor point in a hypothesis, is the negative evidence stronger as you get further away (presumably with an asymptote)? Or is this effect triggered when the feature is outside some hard boundary, after which a consistent amount of negative evidence is applied?
Am I boxing myself into a bad approach by ignoring multi-object scenes for now?
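The two alternatives in the "smooth effect" question can be written down as functions of the distance to the nearest stored point. This is only a minimal sketch to make the choice concrete: the function names, the `scale` and `boundary` parameters, and the exponential shape are all assumptions for illustration, not anything taken from Monty's code.

```python
import math

# Assumed floor for the penalty; both functions stay within [-1, 0].
MAX_NEGATIVE_EVIDENCE = -1.0


def smooth_penalty(distance: float, scale: float) -> float:
    """Penalty that grows with distance and levels off at the floor.

    Hypothetical shape: 0 at distance 0, approaching MAX_NEGATIVE_EVIDENCE
    asymptotically as distance becomes large relative to `scale`.
    """
    if distance < 0 or scale <= 0:
        raise ValueError("distance must be >= 0 and scale must be > 0")
    return MAX_NEGATIVE_EVIDENCE * (1.0 - math.exp(-distance / scale))


def hard_boundary_penalty(distance: float, boundary: float) -> float:
    """No penalty inside the boundary, one constant penalty outside it."""
    if distance < 0 or boundary < 0:
        raise ValueError("distance and boundary must be >= 0")
    return MAX_NEGATIVE_EVIDENCE if distance > boundary else 0.0


if __name__ == "__main__":
    # Compare the two shapes at a few illustrative distances.
    for d in (0.0, 0.005, 0.01, 0.05):
        smooth = round(smooth_penalty(d, scale=0.01), 3)
        hard = hard_boundary_penalty(d, boundary=0.01)
        print(f"distance={d}: smooth={smooth}, hard={hard}")
```

The smooth version needs a length scale to be chosen, while the hard version needs a cut-off distance; either way there is one extra parameter besides the amplitude.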

brainwaves · November 27, 2025, 6:19pm

Hi @carver, that’s great! The team will get back to you ASAP.

nleadholm · December 2, 2025, 11:53am

Hi @carver, that’s great that you’re interested in this item! It’s a really interesting one, and a problem we have returned to several times because some aspects of it are deceptively hard to pin down. I’m just in the process of updating the Future Work page with some more information, and then I’ll get back to you.

carver · December 9, 2025, 3:50pm

Thanks! I’m happy to do this asynchronously. Also, if you or anyone feels up for doing a little code walkthrough with me, it would really help me in getting started. (and maybe ease the task of writing up your thoughts?) Whatever works for you.

I’m stepping my way through understanding the code. I’m making progress, but it is a bit slow.

vclay · December 10, 2025, 10:14am

Hi @carver

Thanks for your patience! We had an in-depth discussion of this topic with the whole team at last week’s research meeting. We will pull the recording up in our publication timeline so you can hopefully still see it this week. It’s definitely not a straightforward problem, but we have an action plan now with some smaller substeps.

I think setting up a meeting sometime to step through the related code and how we think the changes could be made would be very useful. I’d be happy to join that meeting (since I think I added most of the original off-object handling code and have been thinking about changing this for at least 3 years now). Maybe we could look at a date after the meetup next week (December 17th), since until then things are quite busy on my side. But if you have any questions in the meantime, definitely feel free to ask!

Best wishes,

Viviane

carver · December 10, 2025, 3:58pm

Wonderful, thanks @vclay!

Hah, yes, I am starting to see for myself that it is not straightforward. But I’m up for a challenge.

Yeah, that would be fantastic! I’m generally pretty flexible during the work week. I am in GMT+1 but am happy to find a time that’s convenient in your time zone.

vclay · December 11, 2025, 8:25am

Great! Happy to hear that you are up for a challenge, because this definitely requires that.

We have now published the recording of last week’s research meeting. You can find it here: https://www.youtube.com/watch?v=ncAEi1OmtAE It’s a bit interleaved with some thoughts and observations around visual perception and illusions and is quite lengthy. I can make a short summary of the action items that came out of it next week, but maybe this part of the video is a good starting point for the summary (https://www.youtube.com/watch?v=ncAEi1OmtAE&t=5621s).

I’m also in Europe, so that will make scheduling much easier. I’ll write you a DM about times.

nleadholm · December 12, 2025, 12:06pm

Thanks for your patience @carver, great to have you working on this problem!

I’ve now opened a PR summarizing our current thoughts on this task. Note that “off object observations” are intertwined with what we call “out of model movements”, so it is best to understand both of these settings before attempting to tackle one.

Hope the updated documentation helps and curious to hear any thoughts you have (feel free to comment on the PR itself).

github.com/thousandbrainsproject/tbp.monty

Update Future Work description of off-object movements and out-of-model movements

main ← nielsleadholm:Adjustments-to-future-work-research-pages

opened 11:57AM - 12 Dec 25 UTC

nielsleadholm

+82 -6

An effort to update and consolidate our current thinking on "off object" and "ou…t of model" movements. @hlee9212 I thought you would be interested to see these updated pages given your involvement with the earlier RFC drafts. Let me know any thoughts you have.

In terms of your earlier questions, I think some of these may be answered by this. For “Do I need to make a choice about the scale of the negative evidence applied? How do I pick what the amplitude of the effect is?” the answer is: sort of. The amplitude is controllable, and you will need to add some logic for handling null observations. However, the maximum negative evidence that Monty currently receives (-1) would be a sensible starting point, so in practice this is hopefully a hyperparameter you don’t need to play with.
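As a rough illustration of what "logic for handling null observations" could look like, here is a minimal sketch that updates a list of per-hypothesis evidence values when the sensor reports nothing. The function name, the `expects_feature` flags, and the `positive_evidence` parameter are hypothetical and do not correspond to Monty's actual data structures; only the -1 starting value for negative evidence comes from the discussion above.

```python
from typing import List, Sequence


def apply_null_observation(
    evidence: Sequence[float],
    expects_feature: Sequence[bool],
    *,
    negative_evidence: float = -1.0,  # starting point suggested in the thread
    positive_evidence: float,         # no value is given in the thread; caller must choose
) -> List[float]:
    """Return updated evidence after an observation with nothing sensed.

    Hypotheses that predicted a feature at the sensed location are penalized;
    hypotheses that predicted empty space are rewarded.
    """
    if len(evidence) != len(expects_feature):
        raise ValueError("evidence and expects_feature must have equal length")
    return [
        e + (negative_evidence if expected else positive_evidence)
        for e, expected in zip(evidence, expects_feature)
    ]


if __name__ == "__main__":
    # Two hypotheses with equal evidence; only the first predicted a feature here.
    # The positive_evidence value is an arbitrary placeholder.
    print(apply_null_observation([2.0, 2.0], [True, False], positive_evidence=1.0))
```

Keeping both amplitudes as explicit arguments makes it easy to start from -1 and only tune the values later if experiments show it is needed.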

carver · December 16, 2025, 11:20am

Another question I’ve been thinking about: what might be a good starting experiment to run locally when I’m testing out changes. Preferably something that can run relatively quickly, uses modern configurations, and I guess single-object scenes. Maybe just the first example from the tutorials, randrot_10distinctobj_surf_agent.yaml?

vclay · December 16, 2025, 11:23am

Yes, I think that is a good experiment to start with (should take ~3-5 minutes to run on a laptop). If you want to debug faster, you can also update the config like this:

set n_eval_epochs: 1 (at the highest level of the config)
set object_names to a shorter list (like - mug and - bowl)
set wandb_handlers: [] to not log to weights and biases (starts experiment faster and avoids cluttering your wandb with debugging runs)
set python_log_level: INFO or DEBUG to get more informative logs of what is happening

The updated config could look like this:

config:
  model_name_or_path: ${benchmarks.pretrained_dir}/surf_agent_1lm_10distinctobj/pretrained/
  n_eval_epochs: 1
  max_total_steps: 5000
  logging:
    run_name: randrot_noise_10distinctobj_surf_agent
    wandb_handlers: []
    python_log_level: INFO
  monty_config:
    monty_args:
      min_eval_steps: ${benchmarks.min_eval_steps}
    sensor_module_configs:
      sensor_module_1:
        sensor_module_class: ${monty.class:tbp.monty.frameworks.models.sensor_modules.Probe}
        sensor_module_args:
          sensor_module_id: view_finder
          save_raw_obs: true
  eval_env_interface_args:
    object_names:
      - mug
      - bowl

I would also test with the distant agent since that policy uses the off-object observations to reverse the previous action and move back onto the object. You can run randrot_noise_10distinctobj_dist_agent for that.
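The "reverse the previous action" behaviour mentioned above can be pictured with a small sketch. The `Move` type and `next_move` function are hypothetical stand-ins for illustration only; they are not the distant agent's actual policy classes or action types in Monty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """Hypothetical 2D sensor movement (e.g. pan and tilt amounts)."""

    dx: float
    dy: float

    def reversed(self) -> "Move":
        """Return the movement that undoes this one."""
        return Move(-self.dx, -self.dy)


def next_move(on_object: bool, previous: Move, proposed: Move) -> Move:
    """Pick the next movement.

    If the last observation was off-object, undo the previous movement to
    return to the object; otherwise continue with the proposed movement.
    """
    return proposed if on_object else previous.reversed()


if __name__ == "__main__":
    last = Move(1.0, -2.0)
    planned = Move(0.5, 0.5)
    print(next_move(on_object=False, previous=last, proposed=planned))
    print(next_move(on_object=True, previous=last, proposed=planned))
```

A policy of this kind already consumes off-object observations, which is why changes to how those observations are handled are worth testing with the distant agent as well.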

Let me know if you run into any issues with this.

carver · December 16, 2025, 1:12pm

I saved this config in speedrun_randrot_noise_2distinctobj_surf_agent.yaml and ran it with python run.py experiment=speedrun_randrot_noise_2distinctobj_surf_agent.

It gave me this error:

Error executing job with overrides: ['experiment=speedrun_randrot_noise_2distinctobj_surf_agent']
Traceback (most recent call last):
  File "/home/carver/code/tbp.monty/src/tbp/monty/frameworks/run.py", line 43, in main
    Path(cfg.experiment.config.logging.output_dir)
omegaconf.errors.ConfigAttributeError: Key 'output_dir' is not in struct
    full_key: experiment.config.logging.output_dir
    object_type=dict

I could add output_dir manually, of course, but the intent here seems to be to build on top of `randrot_10distinctobj_surf_agent.yaml`, so I tried adding this default reference instead:

defaults:
  - randrot_noise_10distinctobj_surf_agent

config:
  model_name_or_path: ${benchmarks.pretrained_dir}/surf_agent_1lm_10distinctobj/pretrained/
  n_eval_epochs: 1
  max_total_steps: 5000
  logging:
    run_name: randrot_noise_10distinctobj_surf_agent
    wandb_handlers: []
    python_log_level: INFO
  monty_config:
    monty_args:
      min_eval_steps: ${benchmarks.min_eval_steps}
    sensor_module_configs:
      sensor_module_1:
        sensor_module_class: ${monty.class:tbp.monty.frameworks.models.sensor_modules.Probe}
        sensor_module_args:
          sensor_module_id: view_finder
          save_raw_obs: true
  eval_env_interface_args:
    object_names:
      - mug
      - bowl

It seems to run okay (in about 12 seconds on my laptop, which is great).

tslominski · December 17, 2025, 4:27pm

Hi @carver, that configuration should work.

There is one caveat: dictionaries in Hydra merge (they do not override), so config.monty_config.sensor_module_configs will be merged with what is in randrot_noise_10distinctobj_surf_agent. It just so happens that the values in your config.monty_config.sensor_module_configs and in randrot_noise_10distinctobj_surf_agent are the same.
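To make the merge-versus-override distinction concrete, here is a minimal pure-Python sketch of a recursive dictionary merge. It only imitates the behaviour described above and is not Hydra's or OmegaConf's implementation; the `merge_configs` helper and the example values are made up for illustration. Replacing lists wholesale is an assumption that matches my understanding of the default behaviour.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_configs(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge `override` into a copy of `base`.

    Nested dictionaries are merged key by key, so keys that appear only in
    `base` survive. Any other value (including a list) is replaced.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical stand-ins for the base experiment and the override file.
    base = {
        "n_eval_epochs": 3,
        "logging": {"run_name": "base_run", "wandb_handlers": ["some_handler"]},
    }
    override = {"n_eval_epochs": 1, "logging": {"wandb_handlers": []}}
    print(merge_configs(base, override))
```

In this example the base `run_name` is still present after the merge even though the override's `logging` section never mentions it, which is the reason redundant keys can be dropped from an override file.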

Here’s a smaller config with redundant parameters omitted. It should be equivalent to what you have, although I updated the config.logging.run_name as well:

defaults:
  - randrot_noise_10distinctobj_surf_agent

config:
  n_eval_epochs: 1
  logging:
    run_name: speedrun_randrot_noise_2distinctobj_surf_agent
    wandb_handlers: []
    python_log_level: INFO
  eval_env_interface_args:
    object_names:
      - mug
      - bowl