I’ve been vaguely in tune with Sparse Distributed Representations (SDRs) since reading “On Intelligence”. Indeed, the concept has reminded me of related ideas (e.g., cosine similarity, genetic algorithm encodings). However, I’ve also been a bit fuzzy on the details, even after reading some of the docs and such.
So, I went on a fairly deep dive with ChatGPT (it offers no shallow dives :-). Here’s a summary of sorts:
Q: What are the benefits of SDRs in Monty? How are they used, implemented, etc?
In the Monty / Thousand Brains context, SDRs (Sparse Distributed Representations) aren’t just a data structure choice—they’re a theoretical commitment about how perception, prediction, and coordination should work at scale.
I’ll break this down into (1) why SDRs matter, (2) how Monty uses them, and (3) how they’re implemented in practice, with some notes on engineering trade-offs.
1. Why SDRs are valuable in Monty
- 1.1 Noise tolerance & robustness
- 1.2 Semantic similarity “for free”
- 1.3 Composability and union
- 1.4 Stable identity across transformations
- 1.5 Distributed consensus without central control
2. How SDRs are used in Monty
- 2.1 Inside a Learning Module (LM)
- 2.2 Evidence accumulation
- 2.3 Cross-LM communication
- 2.4 Action selection
3. How SDRs are implemented (practically)
- 3.1 Representation formats
- 3.2 Encoding sensory data
- 3.3 Learning = storing SDR associations
- 3.4 Scaling to many actors (your interest)
4. Why SDRs fit Monty better than dense vectors
5. One key intuition to keep in mind
SDRs let Monty think with partial truth.
Q: I’d like more information on how an SDR might be calculated from a collection of numeric values, keywords, etc.
1. Core rules an SDR encoder must satisfy
No matter the input type, a good SDR encoder obeys these invariants:
- Fixed dimensionality: every encoding is the same length N (e.g. 2048 bits)
- Fixed sparsity: exactly (or approximately) K bits are ON (e.g. 40 bits)
- Semantic locality: similar inputs → overlapping bits; dissimilar inputs → minimal overlap
- Determinism: same input → same SDR (unless explicitly stochastic)
- Union safety: OR’ing multiple SDRs should not saturate too quickly
These constraints drive how you calculate the bits.
2. Encoding scalar numeric values
2.1 Bucketed scalar encoder (classic HTM style)
Suppose:
- Range: 0–100
- Resolution: 1 unit
- SDR size: 2048
- Active bits: 40
- Window width: 40
Steps:
- Divide range into buckets (101 buckets)
- Assign each bucket a contiguous block of 40 bits, with bucket i starting at bit base + i
- Neighboring buckets therefore overlap heavily (adjacent buckets share 39 bits)
Example:
value = 37
active bits = [base + 37 ... base + 76]
Properties:
- value=37 and value=38 share 39 of their 40 bits (each has one bit the other lacks)
- Distance increases smoothly
Pros
- Very interpretable
- Perfect locality
Cons
- Requires known numeric range
- Does not extrapolate
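The steps above can be sketched in a few lines of Python. This is a minimal illustration under the parameters listed in the example (range 0–100, resolution 1, 40-bit window); the function names `encode_scalar` and `overlap` and the `base` offset argument are made up for this sketch and do not come from Monty or any HTM library.

```python
def encode_scalar(value, min_val=0.0, max_val=100.0, resolution=1.0,
                  n_bits=2048, width=40, base=0):
    """Return the sorted ON-bit indices for a bucketed scalar encoding."""
    if not min_val <= value <= max_val:
        raise ValueError("value outside the encoder's range")
    # Bucket index: how many resolution steps the value is above min_val.
    bucket = int(round((value - min_val) / resolution))
    # Each bucket's block starts one bit after the previous bucket's block.
    start = base + bucket
    if start + width > n_bits:
        raise ValueError("encoding does not fit in n_bits")
    return list(range(start, start + width))


def overlap(a, b):
    """Count the bits that two index lists have in common."""
    return len(set(a) & set(b))


if __name__ == "__main__":
    print(overlap(encode_scalar(37), encode_scalar(38)))  # adjacent values
    print(overlap(encode_scalar(37), encode_scalar(77)))  # values 40 apart
```

Note that with these numbers only about 140 of the 2048 bits are ever used, which is one reason a real encoder would normally share the bit space with other fields.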
2.2 Random distributed scalar encoder (hash-based)
Instead of contiguous blocks:
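- For each bucket:
  - Use a PRNG seeded with the bucket index
  - Select K pseudo-random bit positions, keeping most of the previous bucket's positions (fully independent draws would leave adjacent buckets with almost no overlap)
- Buckets that are Δ steps apart:
  - Share K-Δ bits
  - Differ by Δ bits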
This avoids contiguous blocks and:
- Reduces sensitivity to bit corruption
- Improves union behavior
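One way to get the "share K-Δ bits" property is to derive each bucket from its predecessor by swapping out a single bit, since K positions drawn independently per bucket would overlap only by chance (about K²/N bits). The sketch below does that with a seeded PRNG. It is an illustration only, not the algorithm of any particular library, and `build_buckets` is a hypothetical name.

```python
import random


def build_buckets(n_buckets, n_bits=2048, k=40, seed=42):
    """Return a list of frozensets; entry i holds the ON bits of bucket i."""
    rng = random.Random(seed)
    # Ordered window of active bits, oldest first.
    window = rng.sample(range(n_bits), k)
    buckets = [frozenset(window)]
    for _ in range(1, n_buckets):
        active = set(window)
        # Pick a replacement bit that is not currently active.
        new_bit = rng.randrange(n_bits)
        while new_bit in active:
            new_bit = rng.randrange(n_bits)
        # Drop the oldest bit and append the new one.
        window = window[1:] + [new_bit]
        buckets.append(frozenset(window))
    return buckets


if __name__ == "__main__":
    buckets = build_buckets(101)
    print(len(buckets[37] & buckets[38]))  # adjacent buckets
    print(len(buckets[37] & buckets[47]))  # buckets 10 apart
```

Adjacent buckets share exactly K-1 bits here, and buckets Δ apart share at least K-Δ (occasionally more, if a dropped bit is drawn again later).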
3. Encoding vectors / coordinates (e.g., position, pose)
3.1 Independent axis encoders + union
For a 2D point (x, y):
SDR(x, y) = SDR_x(x) OR SDR_y(y)
Each axis:
- Uses its own scalar encoder
- Uses disjoint bit regions
Example
- x → bits 0–1023
- y → bits 1024–2047
Properties
- Axis independence
- Easy projection & debugging
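As a rough sketch of the disjoint-region idea, the code below gives each axis its own 1024-bit region and a contiguous 40-bit block inside it. The value range of 0–100 per axis and the names `encode_axis` and `encode_point` are assumptions made for the example.

```python
def encode_axis(value, min_val, max_val, region_start,
                region_size=1024, width=40):
    """Encode one coordinate as a contiguous block inside its own region."""
    if not min_val <= value <= max_val:
        raise ValueError("value outside range")
    # Number of distinct block positions that fit inside the region.
    n_buckets = region_size - width + 1
    frac = (value - min_val) / (max_val - min_val)
    bucket = int(round(frac * (n_buckets - 1)))
    start = region_start + bucket
    return set(range(start, start + width))


def encode_point(x, y, min_val=0.0, max_val=100.0):
    """Union of the x encoding (bits 0-1023) and the y encoding (bits 1024-2047)."""
    return (encode_axis(x, min_val, max_val, 0)
            | encode_axis(y, min_val, max_val, 1024))


if __name__ == "__main__":
    a = encode_point(12.4, 7.9)
    b = encode_point(12.4, 90.0)
    # Same x, very different y: only the x half of the bits is shared.
    print(len(a), len(a & b))
```

Because the regions never overlap, projecting back to one axis is just a matter of filtering indices below or above 1024.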
3.2 Grid-cell–inspired encoders (Monty-style favorite)
Multiple periodic encoders at different scales:
For each scale:
- Project (x,y) onto a lattice
- Activate bits corresponding to nearest phase
- Union across scales
This yields:
- Unique codes over large spaces
- Local similarity preserved
- No fixed boundaries
This mirrors biological grid cells.
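A toy version of the multi-scale idea might look like the following. It uses square lattices with random scale and orientation per module and turns on one bit per module, so it is a deliberate simplification (biological grid cells tile space hexagonally, and nothing here is taken from Monty's code). The names `make_modules` and `encode_grid`, the scale range, and the 32-module by 8x8-cell layout are all assumptions.

```python
import math
import random


def make_modules(n_modules=32, seed=0):
    """Create (scale, angle) pairs, one per periodic module."""
    rng = random.Random(seed)
    return [(rng.uniform(1.0, 20.0), rng.uniform(0.0, math.pi / 2))
            for _ in range(n_modules)]


def encode_grid(x, y, modules, cells_per_side=8):
    """Return one ON bit per module, chosen by the point's phase in that module."""
    bits = set()
    per_module = cells_per_side * cells_per_side
    for i, (scale, angle) in enumerate(modules):
        # Rotate the point into the module's own frame.
        u = x * math.cos(angle) + y * math.sin(angle)
        v = -x * math.sin(angle) + y * math.cos(angle)
        # Phase within one lattice period, in [0, 1).
        pu = (u / scale) % 1.0
        pv = (v / scale) % 1.0
        # Quantize the phase to the nearest cell; the modulo guards rounding.
        cu = int(pu * cells_per_side) % cells_per_side
        cv = int(pv * cells_per_side) % cells_per_side
        bits.add(i * per_module + cu * cells_per_side + cv)
    return bits


if __name__ == "__main__":
    modules = make_modules()
    here = encode_grid(12.4, 7.9, modules)
    near = encode_grid(12.5, 7.9, modules)
    print(len(here), len(here & near))
```

With 32 modules of 64 cells each this yields a 2048-bit SDR with 32 active bits, and there is no fixed boundary on x or y because every module wraps around.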
4. Encoding keywords / symbols
4.1 Simple hash-based categorical encoder
For a keyword "handle":
- Hash the string with multiple hash functions
- Each hash selects one bit position
- Turn on K bits total
Example:
bits = {h1("handle"), h2("handle"), ..., h40("handle")}
Properties
- Orderless
- Deterministic
- Language-agnostic
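Here is a minimal sketch of such a categorical encoder. Instead of 40 separate hash functions it salts SHA-256 with a counter, and it keeps hashing when two salts land on the same bit so that exactly K bits end up ON; both choices are assumptions of this example, as is the name `encode_symbol`.

```python
import hashlib


def encode_symbol(symbol, n_bits=2048, k=40):
    """Map a string to exactly k ON-bit indices, deterministically."""
    bits = set()
    counter = 0
    while len(bits) < k:
        # The counter acts as the index of the "hash function".
        digest = hashlib.sha256(f"{counter}:{symbol}".encode("utf-8")).digest()
        bits.add(int.from_bytes(digest[:8], "big") % n_bits)
        counter += 1
    return bits


if __name__ == "__main__":
    handle = encode_symbol("handle")
    grip = encode_symbol("grip")
    # Unrelated symbols overlap only by chance with this encoder.
    print(len(handle), len(handle & grip))
```

Since SHA-256 is standardized, the same symbol should map to the same bits in any language that follows the same salting and byte-order convention.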
4.2 Semantic field encoding (related words overlap)
To get:
- "handle" close to "grip"
- "cup" close to "mug"
Approaches:
- Manually define synonym groups
- Use co-occurrence statistics
- Union multiple base symbols
Example:
SDR("mug") =
SDR("cup") OR
SDR("handle") OR
SDR("ceramic")
Monty often builds these through experience, not predefinition.
5. Encoding sets, lists, and structured data
5.1 Sets (order doesn’t matter)
Just OR the encodings:
SDR({red, round, metal}) =
SDR(red) OR SDR(round) OR SDR(metal)
This works because:
- Bits don’t interfere destructively
- Sparsity is preserved (up to limits)
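To make the union behavior concrete, here is a small sketch that ORs a few random SDRs and then asks how much of a candidate member is present in the result. The random SDRs stand in for real encoder output, and `random_sdr`, `union`, and `match_fraction` are hypothetical helper names.

```python
import random


def random_sdr(rng, n_bits=2048, k=40):
    """Draw k distinct ON bits; a stand-in for a real encoder's output."""
    return frozenset(rng.sample(range(n_bits), k))


def union(*sdrs):
    """OR any number of SDRs together."""
    out = set()
    for sdr in sdrs:
        out |= sdr
    return frozenset(out)


def match_fraction(member, combined):
    """Fraction of the member's bits that are present in the combined SDR."""
    return len(member & combined) / len(member)


if __name__ == "__main__":
    rng = random.Random(1)
    red, round_, metal, wood = (random_sdr(rng) for _ in range(4))
    obj = union(red, round_, metal)
    print(match_fraction(red, obj))   # a true member matches fully
    print(match_fraction(wood, obj))  # a non-member matches only by chance
```

A true member always scores 1.0, while an unrelated SDR scores near the density of the union, which is why the union has to stay sparse for membership tests to remain meaningful.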
5.2 Key–value pairs
Use binding to avoid collisions:
SDR(color=red) = bind(SDR("color"), SDR("red"))
Binding methods:
- XOR (common)
- Permutation + OR
- Circular convolution
Then union multiple fields:
SDR(object) =
bind(color, red) OR
bind(shape, cylinder) OR
bind(material, metal)
This is huge for Monty-style object models.
6. Encoding time and sequences
SDRs don’t encode time implicitly.
You add it by:
- Context windows
- Sequence memory
- Temporal pooling
Example:
SDR_t = SDR(input_t) OR permute(SDR_{t-1})
This creates:
- Directionality
- Temporal sensitivity
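The recurrence above can be sketched as follows, using a cyclic shift by one position as the permutation (an assumption; any fixed permutation would do). The names `permute` and `step` are made up for the example, and the state grows on every step here, so a real system would need some pruning or decay that this sketch leaves out.

```python
def permute(sdr, n_bits=2048, shift=1):
    """Cyclically shift every ON bit by a fixed amount."""
    return frozenset((bit + shift) % n_bits for bit in sdr)


def step(current_input, previous_state, n_bits=2048):
    """SDR_t = SDR(input_t) OR permute(SDR_{t-1})."""
    return frozenset(current_input) | permute(previous_state, n_bits)


if __name__ == "__main__":
    a = frozenset({1, 5, 9})
    b = frozenset({100, 200, 300})
    empty = frozenset()
    a_then_b = step(b, step(a, empty))
    b_then_a = step(a, step(b, empty))
    # The same two inputs in a different order give different states.
    print(sorted(a_then_b))
    print(sorted(b_then_a))
```

Because older inputs are shifted once per step, the position of a bit pattern tells you how long ago it was seen, which is where the directionality comes from.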
7. A worked example (mixed data)
Suppose Monty senses:
touch_force = 3.2
surface = "smooth"
location = (x=12.4, y=7.9)
Compute:
SDR =
SDR_force(3.2)
OR SDR_surface("smooth")
OR SDR_x(12.4)
OR SDR_y(7.9)
This single SDR:
- Represents a moment of experience
- Can be compared, stored, predicted, unioned
8. Why this works so well in Monty
Because:
- Local encoders preserve local meaning
- Union preserves multi-modal context
- Overlap gives confidence
- No encoder needs global coordination
Each LM can invent encoders as long as it:
- Respects sparsity
- Preserves similarity
This is what allows Monty to scale to:
- Many sensors
- Many objects
- Many brains
9. Practical parameter ranges (rules of thumb)
| Parameter | Typical |
|---|---|
| SDR size | 1024–8192 bits |
| Active bits | 20–100 |
| Noise tolerance | ±20–30% bits |
| Union limit | ~10–20 items before saturation |
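The union limit in the table can be sanity-checked with a quick calculation. If M SDRs of K bits each are drawn independently and uniformly from N bits (an idealizing assumption; real encoders are not uniform), the expected fraction of ON bits in their union is 1 - (1 - K/N)^M. The function name below is hypothetical.

```python
def expected_union_density(n_bits, k, m):
    """Expected fraction of ON bits after OR-ing m random k-of-n SDRs."""
    return 1.0 - (1.0 - k / n_bits) ** m


if __name__ == "__main__":
    for m in (1, 5, 10, 20, 50):
        density = expected_union_density(2048, 40, m)
        print(f"{m:3d} items -> {density:.1%} of bits ON")
```

For N=2048 and K=40 this gives roughly 18% density at 10 items and 33% at 20, which fits the "~10–20 items" rule of thumb: beyond that, unrelated SDRs start matching the union by chance.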
10. If you want to go deeper next
We can:
- Design a single encoder library usable from Python, Elixir, Rust
- Explore grid-cell encoders in more detail
- Compare hash-based SDRs vs embeddings
- Walk through a real Monty learning trace
Given your Monty + BEAM + language-neutral fabric interests, the on-the-wire representation and binding choices would be a very natural next step.
Q: “on-the-wire representation and binding choices” sounds interesting
Good — this is where SDRs stop being an internal datastructure and become a systems interface, which is exactly your Monty + BEAM + multi-language concern.
I’ll organize this around (1) what has to go on the wire, (2) concrete wire formats, (3) binding choices, and (4) how this fits a Monty-style message fabric.
1. What must go on the wire (and what must not)
You generally do not want to transmit:
- Raw sensor data
- Dense vectors
- LM-internal structures
You do want to transmit:
- Hypotheses
- Evidence
- Predictions
- Constraints
All of those are naturally representable as SDRs or SDR-like packets.
Minimal semantic payload
An on-the-wire SDR message should carry:
| Field | Purpose |
|---|---|
| sdr_bits | The sparse representation |
| sdr_size | For sanity checking |
| encoding_id | So receivers know how to interpret overlap |
| confidence | Optional scalar |
| ttl or age | Decay semantics |
| context_id | Object / episode / task |
This allows receivers to treat the SDR as:
- Evidence
- Prediction
- Query
without knowing how it was produced.
2. Wire representations (practical choices)
2.1 Index list (canonical baseline)
{
"n": 2048,
"on": [17, 104, 231, 901, 1337],
"enc": "touch.v3"
}
Pros
- Language-neutral
- Human-debuggable
- Ideal for UDP / JSON
Cons
- Overhead if many bits
- Slower unions
Best for
- Early Monty POCs
- Debugging
- Heterogeneous nodes
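Serializing to and from the index-list form shown above takes only a few lines. This sketch uses the same three keys as the JSON example (`n`, `on`, `enc`); the function names `to_wire` and `from_wire` and the validation rules are assumptions.

```python
import json


def to_wire(on_bits, n_bits, encoding_id):
    """Serialize an SDR as a JSON index list."""
    bits = sorted(set(on_bits))
    if bits and (bits[0] < 0 or bits[-1] >= n_bits):
        raise ValueError("bit index out of range")
    return json.dumps({"n": n_bits, "on": bits, "enc": encoding_id})


def from_wire(payload):
    """Parse a JSON index list back into (n_bits, set_of_bits, encoding_id)."""
    msg = json.loads(payload)
    n_bits = msg["n"]
    on_bits = set(msg["on"])
    # Reject indices that fall outside the declared SDR size.
    if any(not 0 <= bit < n_bits for bit in on_bits):
        raise ValueError("bit index out of range")
    return n_bits, on_bits, msg["enc"]


if __name__ == "__main__":
    payload = to_wire([901, 17, 104, 231, 1337], 2048, "touch.v3")
    print(payload)
    print(from_wire(payload))
```

Sorting the indices before sending keeps the payload stable for identical SDRs, which helps with diffing and deduplication while debugging.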
2.2 Bit-packed bytes (performance path)
256 bytes = 2048 bits
Wrapped in:
- protobuf
- CBOR
- MessagePack
Pros
- Compact
- Fast bitwise ops
- SIMD-friendly
Cons
- Less inspectable
- Needs agreement on endianness
Best for
- High-rate LM ↔ LM exchange
- Robotics loops
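A possible packing routine is sketched below. The bit-order convention (bit b lives in byte b // 8, least significant bit first) is an assumption of this example, and it is exactly the kind of detail both ends would have to agree on. `pack_bits` and `unpack_bits` are hypothetical names.

```python
def pack_bits(on_bits, n_bits=2048):
    """Pack ON-bit indices into a bytes object of length ceil(n_bits / 8)."""
    buf = bytearray((n_bits + 7) // 8)
    for bit in on_bits:
        if not 0 <= bit < n_bits:
            raise ValueError("bit index out of range")
        # Bit b goes into byte b // 8, at position b % 8 (LSB first).
        buf[bit // 8] |= 1 << (bit % 8)
    return bytes(buf)


def unpack_bits(data):
    """Recover the set of ON-bit indices from packed bytes."""
    return {i * 8 + j
            for i, byte in enumerate(data)
            for j in range(8)
            if (byte >> j) & 1}


if __name__ == "__main__":
    packed = pack_bits({17, 104, 231, 901, 1337})
    print(len(packed))                  # size in bytes
    print(sorted(unpack_bits(packed)))  # round trip
```

The packed form is always 256 bytes for a 2048-bit SDR regardless of how many bits are ON, so it only pays off when the index list would be larger than that.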
2.3 Hybrid (recommended)
Transmit:
- Index list for sparse messages
- Bit-packed when density > threshold
Include a flag:
"format": "sparse" | "dense"
This mirrors Roaring Bitmap behavior and works extremely well in practice.
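One simple way to pick the threshold is to compare byte costs. The sketch below assumes each index costs two bytes (enough for SDR sizes up to 65536) and that the dense form costs one bit per position; JSON text would cost more per index, so treat the numbers as illustrative. `choose_format` is a hypothetical name.

```python
def choose_format(n_on, n_bits, bytes_per_index=2):
    """Return "sparse" if an index list is smaller than the packed bitmap."""
    sparse_cost = n_on * bytes_per_index
    dense_cost = (n_bits + 7) // 8
    return "sparse" if sparse_cost < dense_cost else "dense"


if __name__ == "__main__":
    print(choose_format(40, 2048))   # a single typical SDR
    print(choose_format(600, 2048))  # a large union of many SDRs
```

Under these assumptions a 2048-bit SDR switches to the dense form at 128 ON bits, so individual SDRs stay sparse on the wire and only large unions get bit-packed.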
3. Binding choices (this is the critical design point)
Binding answers:
How do we represent “color=red” without colliding with “shape=red”?
3.1 XOR binding (simple, common, risky)
bind(A, B) = A XOR B
Pros
- Fast
- Invertible
- Common in HDC literature
Cons
- XOR does not preserve sparsity (the result of binding two K-bit SDRs can have up to 2K ON bits)
- Error-prone under union
- Not biologically plausible
Verdict
I would not recommend XOR for Monty.
3.2 Permutation binding (Monty-friendly)
bind(key, value) = permute(key) OR value
or:
bind(key, value) = OR_i(permute_i(value))
Properties
- Sparsity preserved
- Union-safe
- Robust to bit loss
Implementation
- Predefined bit permutations per key
- Cheap bit-shuffles
This aligns very well with:
- Grid-cell logic
- SDR unions
- Actor systems
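Here is one reading of "predefined bit permutations per key": each key owns a fixed shuffle of the bit positions, and binding a value means pushing its bits through that shuffle. Deriving the shuffle from a hash of the key name is an assumption of this sketch, as are the names `key_permutation`, `bind`, and `unbind`. Python's `random` shuffle is not portable across languages, so a multi-language fabric would need a shared permutation table or a specified algorithm instead.

```python
import hashlib
import random


def key_permutation(key, n_bits=2048):
    """Derive a fixed permutation of bit positions from the key name."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    perm = list(range(n_bits))
    random.Random(seed).shuffle(perm)
    return perm


def bind(key, value_sdr, n_bits=2048):
    """Move each ON bit of the value through the key's permutation."""
    perm = key_permutation(key, n_bits)
    return frozenset(perm[bit] for bit in value_sdr)


def unbind(key, bound_sdr, n_bits=2048):
    """Invert the key's permutation to recover the original value bits."""
    perm = key_permutation(key, n_bits)
    inverse = [0] * n_bits
    for src, dst in enumerate(perm):
        inverse[dst] = src
    return frozenset(inverse[bit] for bit in bound_sdr)


if __name__ == "__main__":
    red = frozenset(range(0, 400, 10))  # stand-in for SDR("red"), 40 bits
    color_red = bind("color", red)
    shape_red = bind("shape", red)
    print(len(color_red), len(color_red & shape_red))
    print(unbind("color", color_red) == red)
```

A permutation never changes the number of ON bits, so sparsity is preserved exactly, and the bound SDRs can be unioned into an object description like any other SDR.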
3.3 Region-based binding (wire-simple)
Divide SDR space:
| Region | Meaning |
|---|---|
| 0–511 | attributes |
| 512–1023 | values |
| 1024–1535 | locations |
| 1536–2047 | time |
Then:
SDR(color=red) =
SDR("color") in region A
OR SDR("red") in region B
Pros
- Extremely simple
- No bitwise ops needed
Cons
- Less flexible
- Harder to extend dynamically
Great for
- Early Monty experiments
- Cross-language safety
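Region-based binding reduces to adding an offset. The sketch below hard-codes the region table from above; the dictionary name `REGIONS` and the helpers `place` and `bind_region` are hypothetical, and the inputs are assumed to be region-local indices in the range 0–511.

```python
# Region boundaries copied from the table above: (start, end), end exclusive.
REGIONS = {
    "attributes": (0, 512),
    "values": (512, 1024),
    "locations": (1024, 1536),
    "time": (1536, 2048),
}


def place(region, local_bits):
    """Shift region-local bit indices into the region's slice of the SDR."""
    start, end = REGIONS[region]
    size = end - start
    out = set()
    for bit in local_bits:
        if not 0 <= bit < size:
            raise ValueError("local bit index does not fit in region")
        out.add(start + bit)
    return out


def bind_region(attribute_bits, value_bits):
    """SDR(attribute=value): attribute bits in region A, value bits in region B."""
    return place("attributes", attribute_bits) | place("values", value_bits)


if __name__ == "__main__":
    # The same local pattern means different things in different regions.
    print(sorted(bind_region({0, 5}, {0, 5})))
```

One thing this sketch makes visible: a union of several attribute=value pairs built this way no longer records which attribute went with which value, so it suits single-pair messages better than whole-object descriptions.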
4. Encoding identity & versioning
You will want encoding IDs on the wire.
Example:
"encoding": {
"namespace": "monty.touch",
"version": 3,
"size": 2048,
"k": 40
}
Receivers can then:
- Compare SDRs safely
- Reject incompatible messages
- Maintain backward compatibility
This matters a lot once you have:
- Multiple LM types
- Multiple teams
- Long-lived agents
5. Confidence, decay, and belief flow
SDRs represent what, but not how much.
Common patterns:
5.1 Scalar confidence
"confidence": 0.73
Used for:
- Voting
- Pruning
- Action selection
5.2 Temporal decay
Each hop:
confidence *= 0.95
This prevents:
- Zombie hypotheses
- Old beliefs dominating
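A per-hop decay rule could be sketched like this. It combines the 0.95 factor above with a hop counter; the `min_confidence` cutoff, the choice to drop a message once its `ttl` reaches zero, and the name `forward` are assumptions of this example.

```python
def forward(message, decay=0.95, min_confidence=0.05):
    """Return the message as it should be re-sent, or None if it should die."""
    if message["ttl"] <= 0:
        return None  # hop budget exhausted
    confidence = message["confidence"] * decay
    if confidence < min_confidence:
        return None  # belief too weak to be worth repeating
    # Copy the message so the caller's version is left untouched.
    return {**message, "ttl": message["ttl"] - 1, "confidence": confidence}


if __name__ == "__main__":
    msg = {"type": "hypothesis", "confidence": 0.62, "ttl": 5}
    hops = 0
    while msg is not None:
        print(hops, msg["ttl"], round(msg["confidence"], 3))
        msg = forward(msg)
        hops += 1
```

With both a hop limit and a confidence floor, a hypothesis disappears either because it has travelled too far or because it has faded, whichever comes first.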
5.3 SDR reinforcement
Instead of a scalar:
- Repeat transmission
- Union with itself
- Bit reinforcement counters locally
This is closer to biology and actor systems.
6. How this fits your BEAM-style fabric
This maps beautifully:
| BEAM Concept | SDR Analogue |
|---|---|
| Actor mailbox | SDR hypothesis queue |
| Message passing | Evidence exchange |
| Supervision | Confidence decay |
| Failure isolation | Partial SDR loss |
SDR messages merged by union are:
- Idempotent
- Order-insensitive
- Loss-tolerant
Which makes them ideal for:
- UDP
- PubSub
- Gossip protocols
7. A concrete message example
{
"type": "hypothesis",
"context": "object_42",
"sdr": {
"n": 2048,
"on": [12, 87, 304, 901, 1456],
"encoding": "vision.object.v2"
},
"confidence": 0.62,
"ttl": 5
}
Any LM can:
- Compare overlap
- Merge beliefs
- Ignore if incompatible
No centralized schema needed.
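On the receiving side, handling such a message might look like the sketch below: check that the encoding matches, then score the overlap against a locally held SDR. Weighting the overlap by the sender's confidence is an assumption, as is the function name `score_hypothesis`; the field names follow the JSON example above.

```python
def score_hypothesis(message, local_bits, local_encoding, local_size):
    """Return a confidence-weighted overlap score, or None if incompatible."""
    sdr = message["sdr"]
    # Overlap is only meaningful between SDRs from the same encoder and size.
    if sdr["encoding"] != local_encoding or sdr["n"] != local_size:
        return None
    incoming = set(sdr["on"])
    if not incoming:
        return 0.0
    overlap = len(incoming & set(local_bits)) / len(incoming)
    return overlap * message["confidence"]


if __name__ == "__main__":
    message = {
        "type": "hypothesis",
        "context": "object_42",
        "sdr": {"n": 2048, "on": [12, 87, 304, 901, 1456],
                "encoding": "vision.object.v2"},
        "confidence": 0.62,
        "ttl": 5,
    }
    mine = {12, 87, 304, 5, 6}
    print(score_hypothesis(message, mine, "vision.object.v2", 2048))
    print(score_hypothesis(message, mine, "touch.v3", 2048))
```

Returning None for a mismatched encoding keeps "incompatible" distinct from "compatible but zero overlap", which matters if the receiver logs or counts rejected messages.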
8. Strong recommendation (based on your goals)
For Monty POC → scale:
- Sparse index-list wire format
- Permutation or region-based binding
- Explicit encoding/version metadata
- Confidence + decay
- Union as the primary merge operator
This keeps:
- Python easy
- Elixir fast
- Rust safe
- UDP viable
9. Where to go next
If you want, next we can:
- Design a formal SDR message spec (fields + types)
- Sketch a Python ↔ Elixir encoder library
- Prototype permutation binding code
- Simulate gossip-based hypothesis convergence
Given your Thousand Brains / Monty work, the formal message spec would probably give you the most immediate leverage.