Causal Cognitive Architecture

I’m curious if any of the TBP folks have done a deeper dive on the Causal Cognitive Architecture hypothesis? From a relatively quick scan of the paper (and Schneider’s 2023 paper) it seems like it seeks to explain how intelligence emerges from a TBP-like model of cortical processing.

3 Likes

Geez, this is quite a piece of work. I dug around a bit and found his repo, howard8888/workspace on GitHub: “Causal Cognitive Architecture 8 (CCA8): a Python simulation of a neonatal mountain goat brain and a prototype Robotic Cognitive Operating System (RCOS) kernel.”

55k lines of spaghetti code, and look at the size of that Readme… :face_with_spiral_eyes:

A significant portion seems LLM-generated:

[The smoking gun]

https://github.com/howard8888/workspace?tab=readme-ov-file#relationship-to-engrams-and-columns

Relationship to engrams and columns

In the full CCA8 picture:

  • Each binding may have engram pointers into column stores (representation layer):

    • a posture binding might have an engram for the proprioceptive/visual pattern of “standing”.

    • a cue binding might have an engram for a particular visual snapshot (“silhouette:mom”).

    • an action binding might have a motor‑related engram representing a learned action pattern (“push_up”).

WorldGraph then plays the hippocampal role:

  • It links these local engrams into episodic and semantic maps, in line with engram and cognitive map theories.

This is exactly the index / representation layer story:

  • Index layer (bindings + edges): discrete nodes for anchors, predicates, cues, actions, organized into a map.

  • Representation layer (columns/engrams): distributed neural‑style representations, pointed to by bindings.

Your “cortical minicolumns are spatial maps” hypothesis fits here by treating each column as a local map over its feature space, with WorldGraph indexing and sequencing them at a higher level.
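To make the index / representation split in that quote easier to picture, here is a minimal Python sketch. None of this is taken from the CCA8 repo: the class names (`ColumnStore`, `Binding`, `WorldGraphIndex`) and the payload format are assumptions for illustration. It only shows thin index nodes holding pointers into a separate store of heavier payloads.

```python
from dataclasses import dataclass, field
from typing import Optional


class ColumnStore:
    """Representation layer: heavy engram payloads keyed by id (hypothetical)."""

    def __init__(self) -> None:
        self._engrams = {}

    def put(self, engram_id: str, payload: list) -> str:
        self._engrams[engram_id] = payload
        return engram_id

    def get(self, engram_id: str) -> list:
        return self._engrams[engram_id]


@dataclass
class Binding:
    """Index-layer node: a tag plus an optional pointer into the column store."""
    tag: str
    engram_id: Optional[str] = None


@dataclass
class WorldGraphIndex:
    """Index layer: bindings plus labeled directed edges that sequence them."""
    bindings: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add(self, binding: Binding) -> None:
        self.bindings[binding.tag] = binding

    def link(self, src: str, label: str, dst: str) -> None:
        self.edges.append((src, label, dst))


columns = ColumnStore()
graph = WorldGraphIndex()
graph.add(Binding("cue:silhouette:mom", columns.put("e1", [0.2, 0.9, 0.4])))
graph.add(Binding("action:push_up", columns.put("e2", [1.0, 0.0])))
graph.add(Binding("posture:standing"))  # a binding without an engram pointer
graph.link("cue:silhouette:mom", "then", "action:push_up")
graph.link("action:push_up", "then", "posture:standing")

# Follow a pointer from the index layer into the representation layer.
mom_payload = columns.get(graph.bindings["cue:silhouette:mom"].engram_id)
```

Seen this way, the graph stays small and symbolic while the payloads can be arbitrarily large, which is the whole point of the pointer indirection described in the readme.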

He’s trying to do so many things at once: a brain evolution engine, an AGI research platform, a cognitive supervisor, a psychiatric model, a goat simulator… Ambitious, but hardly tractable.

For the hell of it, I asked Gemini how vision is implemented. Basically, visual patterns get SHA256 hashes and semantic tags, matching is based on a handwavy “signature similarity”, and “columns” are just storage clusters.
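A rough sketch of what that combination amounts to, under my own assumptions: the hash is computed over raw bytes, and “signature similarity” is replaced here by plain Jaccard overlap of tag sets. I don't know which measure the repo actually uses, and the stored patterns below are made up.

```python
import hashlib


def pattern_id(raw: bytes) -> str:
    """Content-address a pattern by the SHA-256 digest of its raw bytes."""
    return hashlib.sha256(raw).hexdigest()


def tag_similarity(a: set, b: set) -> float:
    """Jaccard overlap of two tag sets (assumed stand-in for 'signature similarity')."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical stored patterns: hash id -> semantic tags.
store = {
    pattern_id(b"mom-silhouette-pixels"): {"silhouette", "mom", "large"},
    pattern_id(b"rock-pixels"): {"rock", "grey", "large"},
}

# Pick the stored pattern whose tags overlap most with what was observed.
observed_tags = {"silhouette", "large"}
best_id = max(store, key=lambda pid: tag_similarity(store[pid], observed_tags))
```

Note that a cryptographic hash only matches byte-identical inputs, so it works as an identifier but contributes nothing to fuzzy matching; all of the actual “recognition” has to come from the tags or whatever signature is compared.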

[Gemini summary of CCA8 vision]

The Causal Cognitive Architecture (CCA) does not use a literal, multi-layered structural replication of the V1-V2-V4 visual hierarchy found in the biological brain. Instead, it employs a functionalist approach that translates raw sensory data into a universal “common currency”: the Navigation Map.

The process for recording raw vision data, such as shapes and colors, follows this pipeline:

1. Normalization (The Shaping Module)

Raw vision data (pixels or numeric arrays) enters the system via the Input Sensory Vectors Shaping Module.

  • Normalization: The shaping module normalizes these raw inputs into a standard vector format usable by the architecture.
  • Channeling: There is a dedicated shaping module for each sensory system (vision, auditory, etc.), ensuring the architecture can process disparate data types in a uniform way.

2. Spatial Mapping (The Association Module)

Once normalized, the data moves to the Input Sensory Vectors Association Module.

  • Feature Mapping: Rather than extracting features through hierarchical layers (like V1 for edges or V4 for color), the system treats these features as coordinates. They are spatially mapped onto specialized navigation maps dedicated to vision.
  • Navigation as Cognition: The system operates on the hypothesis that the neocortex evolved from ancestral navigation circuits. Therefore, “recognizing” a color or shape is functionally modeled as “navigating” to a specific location on a feature-based map.
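As a toy reading of steps 1 and 2 (not CCA8 code; the function names, the RGB input, and the grid resolution are all assumptions), normalization followed by “navigating to a location on a feature map” could be as simple as scaling the input and quantizing it into grid coordinates:

```python
def normalize_rgb(pixel: tuple) -> tuple:
    """Scale 8-bit RGB channels into the range [0, 1]."""
    r, g, b = pixel
    return (r / 255.0, g / 255.0, b / 255.0)


def to_map_cell(feature: tuple, cells_per_axis: int = 4) -> tuple:
    """Quantize each normalized feature value into a grid index on a feature map."""
    # min(...) keeps a value of exactly 1.0 inside the last cell.
    return tuple(min(int(v * cells_per_axis), cells_per_axis - 1) for v in feature)


red = to_map_cell(normalize_rgb((255, 0, 0)))
dark_red = to_map_cell(normalize_rgb((200, 10, 10)))
blue = to_map_cell(normalize_rgb((0, 0, 255)))
```

Under this reading, similar colors land in the same cell and dissimilar ones land far apart, so “recognizing” a color reduces to looking up a location.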

3. Recording and Storage (Engrams and NavPatches)

In the CCA8 implementation, visual information is recorded through a memory pipeline involving WorkingMap and Columns:

  • NavPatches: Complex visual snapshots (like a specific shape or object) are represented as NavPatches (MapSurface patches).
  • Prototypes vs. Instances:
    • Prototypes: Long-term, immutable templates (e.g., a “red ball”) are stored in the Columns module as Engrams.
    • Instances: When the agent “sees” something, it creates a transient instance in the WorkingMap and attempts to match it against these stored prototypes.
  • Engram Snapshots: These “heavy snapshots” contain the rich, high-dimensional sensory content (like the specific color/shape details), while the WorldGraph stores thin pointers to them for efficient indexing.

4. Recognition via Predictive Coding

Instead of a feed-forward pipeline, the system uses a predictive matching loop:

  1. Extraction: The system extracts observed patches from the current EnvObservation.
  2. Comparison: It retrieves candidate prototypes from the Columns based on signature similarity.
  3. Prediction Error: It calculates a “prediction error” (the difference between what it expects based on stored prototypes and what it actually sees).
  4. Update: If the error is high, it may create a new engram (learning a new shape or color); if it is low, it reuses the existing prototype.
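The four steps above can be sketched as a small loop. This is an illustration only: Euclidean distance as the “prediction error”, the threshold value, and the naming of new engrams are my assumptions, not details from the CCA8 code.

```python
import math


def prediction_error(expected: list, observed: list) -> float:
    """Euclidean distance between a stored prototype and an observed patch."""
    return math.dist(expected, observed)


def recognize(observed: list, prototypes: dict, threshold: float = 0.5) -> str:
    """Return the best-matching prototype name, or store a new one if error is high."""
    best_name, best_err = None, math.inf
    for name, proto in prototypes.items():
        err = prediction_error(proto, observed)
        if err < best_err:
            best_name, best_err = name, err
    if best_name is None or best_err > threshold:
        # High error: learn a new prototype instead of reusing an existing one.
        best_name = f"engram_{len(prototypes)}"
        prototypes[best_name] = list(observed)
    return best_name


prototypes = {"red_ball": [1.0, 0.0, 0.0]}
first = recognize([0.9, 0.1, 0.0], prototypes)   # close to red_ball
second = recognize([0.0, 0.0, 1.0], prototypes)  # far from everything stored
```

Existing prototypes are never modified here, which matches the “immutable templates” description; whether the real system ever updates them is not something the summary says.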

In summary, the system “mimics” V1-V4 by replacing the hierarchical layers with parallel navigation modules that map visual features into a spatial grid, allowing for the same functional result (identifying shapes and colors) without simulating low-level spiking neurons.

Here’s a table from the readme that contains the higher-level objects and their purpose:

[Mammalian / Human memory systems and their CCA8 equivalents]

https://github.com/howard8888/workspace?tab=readme-ov-file#mammalian–human-memory-systems-and-their-cca8-equivalents-conceptual-map

This table is a conceptual mapping, not a claim of exact neuroanatomical equivalence. It is intended to help readers orient themselves: “if I know the human memory taxonomy, where does that live in CCA8?”

| Mammalian / human memory system | What it does (brain-side) | CCA8 equivalent (architecture / simulation) |
| --- | --- | --- |
| Sensory memory (iconic / echoic / haptic) | Very short-lived sensory traces (sub-second to a few seconds) in primary sensory cortex pipelines | HybridEnvironment → EnvObservation as the “incoming perceptual stream” (raw_sensors + predicates + cues). Optionally, capture as engrams if you want persistence. The intent is that this stream is transient and can be configured for different time windows. |
| Short-term memory | Passive short holding buffer (~15–30s; classic “7±2” item framing) | WorkingMap as a short-term high-bandwidth trace (bounded by max_bindings) plus BodyMap as a tiny register. (CCA8 does not yet enforce strict capacity; instead it provides pruning knobs.) |
| Working memory (overall) | Short-term + active processing (“workspace”) | WorkingMap + PolicyRuntime/Action Center + FOA/base mechanisms. WorkingMap holds the local trace; PolicyRuntime/Action Center selects what to do next; FOA/base are the “what is currently relevant?” scaffolds. |
| • Central executive (WM component) | Attention control, selection, coordination | PolicyRuntime / Action Center (gating → triggering → executing) plus FOA selection. |
| • Phonological loop (WM component) | Verbal/auditory rehearsal system | Not a focus in the goat profile; future “human-like” profiles would likely map this to column/engram payloads + rehearsal-like controller loops. |
| • Visuospatial sketchpad (WM component) | Spatial/visual manipulation (“mind’s eye”) | BodyMap + environment geometry (near-space posture/mom/shelter/cliff) and (future) richer engrams for spatial scenes. |
| • Episodic buffer (WM component) | Integrates across WM subsystems and links to LTM | A future “bridge” layer: WorkingMap → consolidation into WorldGraph/Columns (partially scaffolded today; details still evolving). |
| Long-term memory: episodic (explicit/declarative) | Personal event memory; hippocampal indexing and retrieval | WorldGraph as the episode index + methods that reconstruct trajectories (bindings/edges with provenance). (CCA8 episodic details remain a design focus and will evolve.) |
| Long-term memory: semantic (explicit/declarative) | General knowledge/facts consolidated in cortex (semantic hub concepts) | WorldGraph as the symbolic index; optionally memory_mode="semantic" (experimental). Longer-term, “semantic engrams” belong in Columns, with WorldGraph as the pointer/index layer. |
| Procedural memory (implicit/non-declarative) | Skills/habits; basal ganglia + cerebellum involvement | Controller policies/primitives and their learned parameters (e.g., skill ledger / q values). This is “how to do things,” not “facts about the world.” |
| Priming / classical conditioning (implicit) | Learned associations (cue → response), often emotion/autonomic linked | Autonomic + drive/threshold cues + learned primitives: rising-edge interoceptive cues (cue:drive:*), valence tags, and policy selection shaping. (CCA8 currently expresses this via autonomic tick + cue/policy machinery; richer conditioning is future work.) |
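For the short-term memory row, a minimal sketch of a trace “bounded by max_bindings” might look like the following. The `max_bindings` name comes from the table; the class name and the drop-the-oldest policy are my assumptions, and the readme itself says strict capacity is not yet enforced.

```python
from collections import deque


class WorkingMapSketch:
    """Bounded short-term trace: keeps only the newest max_bindings entries."""

    def __init__(self, max_bindings: int) -> None:
        # deque(maxlen=...) silently discards the oldest item once it is full.
        self._trace = deque(maxlen=max_bindings)

    def add(self, binding: str) -> None:
        self._trace.append(binding)

    def contents(self) -> list:
        return list(self._trace)


wm = WorkingMapSketch(max_bindings=3)
for b in ["posture:fallen", "cue:mom", "action:push_up", "posture:standing"]:
    wm.add(b)
```

After the fourth binding is added, the oldest one has been pruned, which is one plausible meaning of a “pruning knob”.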

I mean, there’s some interesting stuff in there and some similarities with TBT, but “biologically plausible” is a bit of a stretch. Lots of top-down handcrafting…

2 Likes

Thanks for sharing this; we had not heard of the Causal Cognitive Architecture before. On the surface, it appears that Dr Schneider is exploring some ideas similar to those proposed in prior Numenta/TBP work (Thousand Brains, etc.), as well as in Monty. However, there are also some puzzling claims. For example, the paper predicts that “minicolumns” model “navigation maps”. This sounds similar to one of the core principles of the TBT, that cortical columns model reference frames. However, it’s puzzling because minicolumns are very narrow (40–50 μm), which is on the order of the diameter of a single pyramidal cell, and each minicolumn only contains around 100 neurons. The Thousand Brains Theory claim that an entire column (sometimes known as a hypercolumn) can model reference frames is already a relatively controversial / bold prediction in the neuroscience community, so it seems strange to predict that individual minicolumns could do this. There is some reference to Jeff’s prior work (suggesting that Dr Schneider is familiar with it), but only briefly, when discussing the existence of grid cells outside of entorhinal cortex. If the paper went into detail about how the theory compares and contrasts with the TBT, it would be easier to understand what is being proposed, since there are clearly some parallels, and possibly some inspiration.

The above focuses more on the neuroscience-theory side. As AgentRev pointed out, there is a lot going on, both in the results and the simulations, and I haven’t dug into the details. If you end up playing around with the repo and find some interesting results or capabilities, definitely feel free to share.

2 Likes

Yeah, I’m not touching that code… :grinning_face_with_smiling_eyes:

Even after a quick scan, I was skeptical about the rigor since, as you said, there was only one passing reference to Jeff’s prior papers and not a single mention of Vernon Mountcastle. That’s a red flag.

3 Likes

This Causal Cognitive Architecture paper by Schneider seems to be built on the foundation that the minicolumn is a repeated unit. There’s experimental evidence that the cortical column is a repeated unit in much of the cortex, but I am not aware of such evidence for cortical minicolumns.

There are some interesting concepts in there: temporal binding of sensory inputs, taking snapshots of sensory inputs to track and make memories of motion, and processing in cycles. But I have my usual reservations about tackling high-level conceptual structures without having first created low-level structures. Take the linguistic instructions to manipulate spheres, blocks, and cylinders: in the real world you first need to have the concepts of spheres, blocks, and cylinders, and only then learn how to manipulate them. To be topical, simulating a flight to the moon and actually flying to the moon are very different things.

1 Like