What Monty features or injections to work on next

I’m creating this topic in an attempt to break my habit of asking @tslominski for extra work in random comment sections on GitHub.

So, which injection or feature would you want assistance with next? Anything goes.

3 Likes

Hey @AgentRev, thank you for volunteering.

I propose working towards the Desired Effect (333 Telemetry collection is configurable down to the Python module level).

Leading to it is (Injection #81 Use structured logging to emit telemetry inline).

This is a large project. Basically, the goal is to replace our custom logging framework for all the scientific data with a logging-like telemetry pipeline that emits snapshot events. Essentially, this means replacing tbp.monty/src/tbp/monty/frameworks/loggers/graph_matching_loggers.py (at main in thousandbrainsproject/tbp.monty on GitHub) and everything that interacts with it. That is, we would no longer collect data at hardcoded step, episode, or epoch boundaries, but instead emit it inline with appropriate event metadata, so that any application reading the telemetry snapshot stream can construct whatever it needs for data analysis or visualization.
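
To make the "construct whatever it needs" part concrete, here is a minimal sketch of a consumer that rebuilds an episode-level view from a flat stream of inline events. Everything in it is an assumption for illustration: the `Snapshot` type, its fields (`source`, `name`, `episode`, `step`, `data`), and the event names are hypothetical and not part of tbp.monty.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Hypothetical telemetry event: a payload plus the metadata to place it."""

    source: str   # emitting module, e.g. "learning_module" (illustrative)
    name: str     # event name (illustrative)
    episode: int  # metadata carried by the event, not a collection boundary
    step: int
    data: dict = field(default_factory=dict)


def group_by_episode(stream):
    """Rebuild a per-episode view on the reader side."""
    episodes = defaultdict(list)
    for snap in stream:
        episodes[snap.episode].append(snap)
    return dict(episodes)


# A flat stream, in emission order. No step/episode boundaries are baked in.
stream = [
    Snapshot("learning_module", "hypothesis", episode=0, step=0, data={"mlh": "mug"}),
    Snapshot("learning_module", "hypothesis", episode=0, step=1, data={"mlh": "bowl"}),
    Snapshot("learning_module", "hypothesis", episode=1, step=0, data={"mlh": "mug"}),
]

# One possible aggregate: number of steps observed in each episode.
steps_per_episode = {
    episode: max(s.step for s in snaps) + 1
    for episode, snaps in group_by_episode(stream).items()
}
```

A different tool could group the same stream by `source` or by `name` instead, which is the flexibility the hardcoded boundaries do not give us today.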

The Current Reality Tree Undesirable Effect (103 Platform internal representations are difficult to visualize) contains more context on the problem we are running into.

I have only thought through the design of this at a very high level. Essentially, what I’m thinking is something analogous to the logger interface, but for telemetry.

So, where we have logger:

import logging

logger = logging.getLogger(__name__)

# ...

logger.error(...)
logger.warning(...)  # logger.warn is a deprecated alias
logger.info(...)
logger.debug(...)

We would have an analogous pipeline (using logging as the implementation detail) that would emit telemetry events:

telemetry = monty.getTelemetry(__name__)

# ...

telemetry.snapshot(...) # some scientific data
telemetry.snapshot(...) # some scientific data

This way, we could reuse all the existing logging infrastructure to select, handle, and filter scientific data emitted by the application.
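
As a rough sketch of what reusing the logging infrastructure could look like (not a design proposal), the facade below routes snapshots through standard `logging` loggers under a dedicated namespace. The names `Telemetry`, `get_telemetry`, `TELEMETRY_ROOT`, `ListHandler`, the module names, and the event payloads are all hypothetical. Only the `logging` calls themselves are real standard-library API.

```python
import logging

TELEMETRY_ROOT = "telemetry"  # hypothetical namespace, kept apart from app logs


class Telemetry:
    """Hypothetical facade: snapshots travel as log records with a payload."""

    def __init__(self, name):
        self._logger = logging.getLogger(f"{TELEMETRY_ROOT}.{name}")

    def snapshot(self, event, **data):
        # `extra` attaches the payload to the LogRecord as `record.telemetry`.
        self._logger.info(event, extra={"telemetry": data})


def get_telemetry(name):
    return Telemetry(name)


class ListHandler(logging.Handler):
    """Stand-in for a real sink (file, socket, live view): collects in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.name, record.getMessage(), record.telemetry))


# Configuration uses the ordinary logger hierarchy.
root = logging.getLogger(TELEMETRY_ROOT)
root.setLevel(logging.INFO)
root.propagate = False  # keep scientific data out of the application log
handler = ListHandler()
root.addHandler(handler)

# Module-level control: silence one (illustrative) module, keep the rest.
logging.getLogger(f"{TELEMETRY_ROOT}.monty.sensor_module").setLevel(logging.CRITICAL)

get_telemetry("monty.learning_module").snapshot("hypothesis_update", step=3, mlh="mug")
get_telemetry("monty.sensor_module").snapshot("observation", step=3)  # filtered out
```

After this runs, `handler.records` holds only the learning module event. Selecting, handling, and filtering all come from stock `logging` features (logger names, levels, handlers), with no telemetry-specific machinery beyond the thin facade.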

Aside from the emit side of things, this project also requires rethinking and redoing all of the live and offline visualizations.

This is a large topic. I think we’ll need an RFC to keep everyone on the same page. Before we start going through all the details, I want to see if this is of interest first.

4 Likes

Yeah, sounds great. It’s been a recurring subject of discussion since Brighton, especially in the live view thread. I’ll do some reading and thinking on that, to get into the mindset.

2 Likes

I noticed that @ash has done some work on a telemetry prototype. Perhaps it can be of use: tbp.monty/src/tbp/monty/libs/telemetry at 08b20e113072be40014883dd83d174d34b7b8365 (thousandbrainsproject/tbp.monty on GitHub)

Seeing all the other fine work you and the team have been doing over the last weeks, I assume this feature is probably less urgent? If desired, I could also help with shorter-term stuff, like @jshoemaker’s MuJoCo integration.

For the MuJoCo integration, I’ve got a pretty clear roadmap on what needs to happen, and I’m in the middle of the first task, so I’m not sure how easy it would be to work on it in parallel. I’m planning to pull that prototype apart into separate stages, and it’s all very fresh in my head at the moment, including the changes I want to make along the way.

1 Like

Right, so for the telemetry, since it involves many different causes, effects, injections, design choices, etc., I think it would make sense to break it down iteratively, Agile style: discuss A, implement it, discuss B, implement it, and so on. Whereas, if I understand correctly, the RFC process seems to be more oriented toward “plan A to Z, then implement A to Z.”

Would it make sense to instead have a small RFC that just… loosely scopes the telemetry? Then part A would be merging a first basic version of the class, and further parts could be schema integration, how it’s serialized, whether we include a database connector, starting to deploy the new class, progressively replacing graph_matching_loggers, integrating it into live views, etc., figuring it out as we go along.

Regarding splitting the work: I’m not yet sure it can be split up. However, if we were to try…

One constraint on what we’d merge into tbp.monty is that we do not want to include dead code. So, practically speaking, if there’s a PR with an unused class, I think it is very unlikely to get merged.

If the work were split, I think we might end up needing to add new telemetry without removing the old way. This can then be done in parts. For example, all learning modules now (also) emit telemetry in the new way. Then all sensor modules, environments, Habitat, MuJoCo... I’m not sure what the best way of breaking it up will be, but there’s probably something that will make sense. Whatever the split, each part should add a full implementation to a portion of Monty, as opposed to a partial implementation to none of Monty.

The reason I think we might want to emit both telemetry types is that tools rely on the current way telemetry is aggregated. Even with all the new telemetry in place, there is a tail of tools that would need to be updated. Only once those tools (to be enumerated) are updated would we fully transition to the new telemetry.
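
A toy sketch of what emitting both could look like for a single component during the migration. `ToyLearningModule`, its methods, the logger name, and the event name are all hypothetical and do not mirror any actual Monty class. The point is only that the legacy boundary-based path stays untouched while the same value is also emitted inline.

```python
import logging

# New path: inline events through a standard logger (name is illustrative).
telemetry_log = logging.getLogger("telemetry.toy_learning_module")
telemetry_log.setLevel(logging.INFO)
telemetry_log.propagate = False


class CaptureHandler(logging.Handler):
    """In-memory sink standing in for whatever reads the new stream."""

    def __init__(self):
        super().__init__()
        self.events = []

    def emit(self, record):
        self.events.append((record.getMessage(), record.telemetry))


capture = CaptureHandler()
telemetry_log.addHandler(capture)


class ToyLearningModule:
    """Hypothetical component that feeds both paths during the transition."""

    def __init__(self):
        self._legacy_buffer = []  # old path: buffered until an episode boundary

    def step(self, step, value):
        self._legacy_buffer.append(value)  # existing tools keep working
        telemetry_log.info(  # new path: emitted immediately, with metadata
            "step_value", extra={"telemetry": {"step": step, "value": value}}
        )

    def end_episode(self):
        # Old path: aggregate handed over at the hardcoded boundary.
        stats, self._legacy_buffer = self._legacy_buffer, []
        return {"values": stats}


lm = ToyLearningModule()
lm.step(0, 0.1)
lm.step(1, 0.4)
legacy_report = lm.end_episode()
```

Once every tool that reads `legacy_report`-style aggregates has moved to the event stream, the buffer and `end_episode` reporting could be deleted without touching the emit sites.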

Yeah, that makes sense. I’m not picky about how development and integration are approached. My question is more about where those discussions should take place. I don’t have an “A to Z” plan in my mind for a full-scale RFC, and I don’t think we should plan A to Z in one go either.

Would it make sense to create a loosely-defined “telemetry” RFC that targets Desired Effect 333, where we can begin those discussions? I don’t think we should scatter telemetry across multiple RFCs. Or maybe a single GitHub issue that will stay open until 333 reaches completion?

If desired, discussions about finer details of the different phases could be split across sub-issues or draft PRs (where development will then take place), referenced in the original RFC / main issue.

First order of business in my mind would be defining the core internals of the telemetry class.

2 Likes

How about here: 333 Telemetry collection is configurable down to the Python module level

1 Like