ANN topology to AGI - RFC

I would like to request a review of the following work; my intent is to find people interested in collaborating. Feedback would be much appreciated. This work is preliminary: I have been working my way through a problem set to define the next generation of ANN topology. Current goal: develop a high-level framework that defines core functionality and uncovers the key questions, in order to determine which areas will require more data to sharpen. Here are the core functionality requirements that I am building around:
1). Continuous learning is a requirement to achieve true general intelligence.
2). Temporal awareness is a requirement to achieve meaningful continuous learning.
3). Multiple Carrier frequencies analogous to brainwaves are necessary to define ANN state and are a requirement to facilitate temporal awareness.

My full ann-research journal is a public repo at the following url: GitHub - reh3376/ann-research: Phase 2 of ANN topological feature build out · GitHub


FWIW, the Goog sez:

Artificial Neural Network (ANN) topology refers to the structural arrangement of nodes (neurons) and the connections between them within a network. It defines how information flows—unidirectionally or in cycles—and dictates the network’s ability to learn and process complex patterns. …

Agreed. A review of the linked repo will make it clear that I concur.

Your MDEMG appears to be a hippocampal-like, text-based RAG system for LLMs, JEPA is a vision transformer that absorbs millions of hours of videos, and Monty is a non-ANN 3D object recognition system (as of now). All three operate in very different modalities.

Both LeCun and Hawkins hope that their respective systems will independently reach AGI-level capabilities eventually.

You say that your system will “satisfy a set of substrate-level commitments that current architectures do not supply”, listing the 5 following elements: continuous learning, generative modeling, reference frames, meta-learning, and recursive composability.

Yet, I’m fairly certain that these objectives are already part of the roadmap for both JEPA and Monty. Are you implying that either approach is insufficient for these, and that it requires an even broader overarching framework? If so, what is the rationale for this claim?

You mention that MDEMG is “the operational cognitive substrate this program builds on”. MDEMG appears to be a text-centric system. However, text is a cultural invention that came into existence as a result of sensorimotor general intelligence. So, to me, trying to build AGI using a text-centric foundation is the equivalent of saying that the chicken must come before the egg. Wouldn’t it make more sense to use a sensorimotor-first foundation? :wink:

Finally, what are the outputs and their modalities? A system must have clearly defined inputs and outputs, lest it become an unbounded problem.


@AgentRev thank you for the response. There are multiple questions here so I will address in order of appearance:
Q1: “You say that your system will “satisfy a set of substrate-level commitments that current architectures do not supply”, listing the 5 following elements: continuous learning, generative modeling, reference frames, meta-learning, and recursive composability.

Yet, I’m fairly certain that these objectives are already part of the roadmap for both JEPA and Monty. Are you implying that either approach is insufficient for these, and that it requires an even broader overarching framework? If so, what is the rationale for this claim?”

A1: Roadmaps are potential paths, not realized and validated work. Yes, I propose that neither encompasses what I call out as foundational requirements in the original post: “… 1). Continuous learning is a requirement to achieve true general intelligence. 2). Temporal awareness is a requirement to achieve meaningful continuous learning. 3). Multiple Carrier frequencies analogous to brainwaves are necessary to define ANN state and are a requirement to facilitate temporal awareness.”

Q2: “You mention that MDEMG is “the operational cognitive substrate this program builds on”. MDEMG appears to be a text-centric system. However, text is a cultural invention that came into existence as a result of sensorimotor general intelligence. So, to me, trying to build AGI using a text-centric foundation is the equivalent of saying that the chicken must come before the egg. Wouldn’t it make more sense to use a sensorimotor-first foundation?”

A2: The relationship between my ann-research journal and the mdemg framework is only moderately coupled, at most. This is likely my fault for not being more specific and clear. The initial use case that got me working on mdemg was very specific to issues my dev teams were seeing with existing frontier LLMs. As I built the mdemg framework, it exposed deeper flaws in the fabric of LLM topology that I had not deeply considered. The ann-research journal is my attempt to consider them. Therefore, mdemg is not the framework I am using to build out further; it was the path to the final “straw that broke the camel’s back” and gave me the motivation to formalize these concepts that I have been informally juggling.

Interesting. Skimming through it, here are a few random thoughts: IMO your journal correctly identifies that LLMs are I/O modules. I agree with you that LLMs implement the functions of Broca’s area and Wernicke’s area, not the whole brain. The whole-brain general algorithm should be inspired by the transformer algorithm, but it is not a transformer on elements of semantic meaning either.

The brain is composed of about 200 regions, implementing about 200 specific competences. Our DNA cannot implement a lot of different algorithms, so the call for a universal algorithm is sound. The DNA would then create the specific I/O wiring and possibly a per-zone hyperparameter tuning of the general algorithm.

I am also interested in AGI, and I am trying to define what those 200 competences are and what common processing each zone is doing. I am convinced that trying to put everything in one LLM, or two, or one LLM with a database will not do it. Our brain has about 7 zones specialized in different aspects of human face processing alone; if there were a way to put everything in 3 or 4 processing zones, nature would have preferred that.

I have been exploring a repeating framework using TypeDB by Vaticle arranged as a hybrid nMLN model. I do not want to limit my research to only sensory inputs that humans process.
A thought experiment I keep revisiting is: what if all other sensory inputs remained the same but humans were able to see the entire electromagnetic spectrum? How would this change our perspective of reality? And what would a mostly BNN-complete ANN learn if we gave it sensory input far in excess of human BNNs? Would this create an intelligence completely foreign to us? Would we be able to understand it…?

Help me understand your github structure and methodology. Everything seems highly structured and controlled with both human and AI-readable components. It’s an interesting approach for managing an intellectual process and I’m just wondering what kind of tools you are using.

Asking for a friend…

A little context is necessary to provide a clear explanation. About 18 to 24 months ago I started developing what seems to commonly be called an ‘engineering harness’: scaffolding for interacting with LLMs that governs and constrains the interaction. The goal was to address the known limitations and weaknesses of the tool:
1). Context windows: as the complexity of interactions increases, the reliable portion of the context window decreases. Once that useful portion is exceeded, unwanted behaviors manifest in increasingly disruptive ways that emphasize the model’s stateless nature.
2). Due to this generally stateless and probabilistic nature, LLMs are not reliable enough to do practical scientific, engineering and R&D work.
3). The static / frozen nature of LLM ‘knowledge’ renders it incapable of producing novel content.
4). RAG and LoRA-type fine-tuning are not sufficient to constrain these behaviors reliably.
I called it MDEMG - multi-dimensional emergent memory graphs (www.github.com/reh3376/mdemg). I was able to develop a harness capable of achieving my goals with a confidence interval approaching 0.90, but moving past the 0.90 mark was not happening due to multiple factors: compute constraints, my knowledge of the subject matter, general ignorance of historical work on this topic, …. To offset these shortcomings I decided to create a highly structured specification framework. You will find documentation for these tools in the mdemg repo, in the -/mdemg/docs/features directory. It is called UxTS (Universal x Test Specification, where x is 1 or 2 characters that hint at the function). For example: to store org- or topic-specific knowledge in layer 0 of the MDEMG structure, I used parsers that chunk files for vector embedding in a 3072D vector space. Using UxTS I created the UPTS (universal parser test specification) and used it to build 28 parsers. It uses runners and CI-type workflows, triggered on repo push, to ensure that the schema of each parser instance complies with the UPTS specification; if an instance does not comply, the workflow logs the failure and alerts the user. I am trying to remain high level, but hopefully this provides enough context to understand its use and relevance.
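
To make the compliance check described above a little more concrete, here is a minimal Python sketch of what such a runner might verify. The field names (`name`, `extensions`, `chunk_size`, `embedding_dim`) and the `check_parser_spec` helper are hypothetical and are not taken from the mdemg repo; only the 3072-dimensional embedding size comes from the post.

```python
# Hypothetical required fields for a parser spec; NOT the real UPTS schema.
REQUIRED_FIELDS = {
    "name": str,
    "extensions": list,
    "chunk_size": int,
    "embedding_dim": int,
}
EXPECTED_EMBEDDING_DIM = 3072  # vector space size mentioned in the post


def check_parser_spec(spec: dict) -> list:
    """Return a list of violations; an empty list means the spec complies."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in spec:
            problems.append(f"missing field: {field}")
        elif not isinstance(spec[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Only check the value when the field is present and correctly typed.
    dim = spec.get("embedding_dim")
    if isinstance(dim, int) and dim != EXPECTED_EMBEDDING_DIM:
        problems.append(
            f"embedding_dim is {dim}, expected {EXPECTED_EMBEDDING_DIM}"
        )
    return problems


# A CI step could call this for every parser and fail the run on any violation.
example = {"name": "markdown", "extensions": [".md"],
           "chunk_size": 512, "embedding_dim": 3072}
for problem in check_parser_spec(example):
    print("UPTS violation:", problem)
```

Returning the full list of violations, instead of stopping at the first one, is a deliberate choice in this sketch: it lets a single CI run log everything that is wrong with a parser instance.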

Answering the question: I use a version of mdemg as a harness to ideate across problem sets in a spec-driven manner using the latest frontier LLMs (currently Opus 4.7).
This methodology creates a journal-like repo to track my process. The ann-research repo I referenced is one such instance of these interactions. This may be too high-level to be useful. I would be more than happy to explain it better in a remote meeting via Zoom / Teams / Meet if you are interested.


@Roger_Henley That sounds quite a lot like OpenHuman:

It has a Markdown-based memory (along with a chunk database service):

Perhaps it would suit your needs better in the immediate term while Monty is brewing? :wink:

As to your earlier points, “Continuous learning is a requirement to achieve true general intelligence” and “Temporal awareness is a requirement to achieve meaningful continuous learning.” are indeed part of TBP’s goals. “Multiple Carrier frequencies analogous to brainwaves […] are a requirement to facilitate temporal awareness” - that one is outside of TBP’s scope, because Monty doesn’t use a neural network. However, its planned Global Interval Timer will assume that responsibility.


Interesting. Spatio-temporal awareness is just one aspect of the carrier frequencies analogous to brain waves. I believe many such carrier waves will be necessary to provide spatio-temporal awareness, biasing states for various functional areas of the topology, and various other whole-system feedback mechanisms. For example, grades of: heightened awareness, restful restructuring, self examination, recursive reflection, …

From what I understand, brain waves are essentially cortical traffic lights, for which a computer has its own arsenal, e.g. mutexes, semaphores, interrupts, etc. Temporal awareness can be dealt with using clock timestamps or variable-bitrate frames like media codecs. LMs could also ingest time as a dimension and act in a way similar to temporal convolutional networks, with different heterarchy levels handling different time windows. Plenty of options.
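
As a toy illustration of the "different levels, different time windows" option, the Python sketch below groups timestamped events into a fast window and a slow window. The `bucket_events` helper, the event values and the window sizes are made up for the example and do not come from any existing system.

```python
from collections import defaultdict


def bucket_events(events, window_s):
    """Group (timestamp_s, value) pairs into fixed-width time windows."""
    buckets = defaultdict(list)
    for t, value in events:
        # Integer index of the window this timestamp falls into.
        buckets[int(t // window_s)].append(value)
    return dict(buckets)


# Timestamps come from a clock, not from the order or speed of processing.
events = [(0.2, "a"), (0.9, "b"), (1.4, "c"), (3.7, "d"), (12.0, "e")]

fast_level = bucket_events(events, window_s=1.0)   # lower level: 1 s windows
slow_level = bucket_events(events, window_s=10.0)  # higher level: 10 s windows
```

The same event stream is seen at two temporal resolutions, which is roughly what a stack of levels with growing windows would give you.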

Awareness can likely be dealt with through saliency, increased sensor module poll rate, and thus increased LM compute / utilization. By “restful restructuring” I assume you mean hippocampal replay and consolidation? It’s quite possible the need for rest is simply a biological limitation that a machine can overcome. The team had some interesting talks about that in this video; Hawkins says it plainly with “Monty doesn’t need to sleep.”

Heh, those are within the purview of theory of mind. My opinion about all ToM stuff is very straightforward; chimps are not ToM-capable, while humans are. The main cortical differences between chimps and humans are more dendritic spines, and plenty more feedback connections from the PFC to other regions. So, logically, these two are what you need for ToM to emerge.

For TBP, this would translate into heterarchical structuring requirements; increased inter-LM exchange in the higher levels, and more top-down feedback connections. (Although, I certainly do not underestimate the challenge this represents. A thousand different network topologies may all turn out to be insufficient! Only time will tell, but I trust in the team’s diligence.)

I would be highly surprised if brain waves turned out to have anything at all to do with ToM. To me, this would be akin to saying a car’s cruise control can only be used within a specific engine RPM range.


@AgentRev fair hit, and a useful one. Let me first correct a framing of mine that probably earned the deflation: I wasn’t claiming anything about how brain waves actually work in biological brains. I was using them as an analogy — an intuition pump — and most of your reductions land, so I’ll take them: coordination by mutexes/semaphores/interrupts, time by timestamps or VBR framing or TCN windows. Agreed. But let me say what I’m actually after, because it reframes the lot: not a base with temporal features bolted on, but a single core unit — repeatable the way a cortical column is — that meets the primary criteria for general intelligence in itself rather than having them supplied to it. The mechanisms you list are features added to a substrate; I’m trying to define the substrate.

The narrower question inside that: should state and spatiotemporal awareness be properties of the unit itself, or supplied to it from outside the way timestamps and TCN windows do? My working hypothesis bets on the former. My current candidate for how is FM, AM, and PM of carrier signals in the core infrastructure — three independent, well-characterized modulation channels. You mentioned VBR framing like codecs; I’m poking at the layer beneath that.
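
For readers less familiar with the three modulation types, here is a minimal Python sketch of AM, FM and PM applied to a single cosine carrier, using the textbook forms: AM scales the amplitude by (1 + k_am·m(t)), PM adds k_pm·m(t) to the phase, and FM adds 2π·k_fm times the running integral of m(t) to the phase. The `modulate` function, the gain values and the 40 Hz / 4 Hz frequencies are illustrative assumptions, not part of any design in the journal.

```python
import math


def modulate(message, fc, fs, k_am=0.5, k_fm=5.0, k_pm=math.pi / 2):
    """Return AM, FM and PM versions of a unit-amplitude cosine carrier.

    message: samples in [-1, 1]; fc: carrier frequency (Hz); fs: sample rate (Hz).
    The k_* gains are illustrative defaults only.
    """
    am, fm, pm = [], [], []
    integral = 0.0  # running integral of the message, needed for FM
    for n, m in enumerate(message):
        base = 2 * math.pi * fc * n / fs  # unmodulated carrier phase
        integral += m / fs
        am.append((1 + k_am * m) * math.cos(base))                 # amplitude
        fm.append(math.cos(base + 2 * math.pi * k_fm * integral))  # frequency
        pm.append(math.cos(base + k_pm * m))                       # phase
    return am, fm, pm


fs, fc = 1000, 40  # sample rate and carrier frequency in Hz (assumed values)
message = [math.sin(2 * math.pi * 4 * n / fs) for n in range(fs)]
am, fm, pm = modulate(message, fc, fs)
```

With a zero message all three outputs collapse to the bare carrier, which is a handy sanity check when experimenting with the gains.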

I want to be clear this is a working hypothesis and nothing firmer. I’m at the front of a normal loop — form it, build a test framework, collect data, adjust as the data dictates. The analogy isn’t arbitrary, but I’m treating the biology as motivation, not evidence; the architecture has to earn it on the bench, and your objection is exactly the kind of thing that decides whether it does.

On ToM — closer than my wording suggested. I didn’t mean waves cause reflection, only that a global signal would modulate the regime it runs in; your cruise-control analogy is mine. Dendritic spines plus PFC feedback as the substrate — no quarrel.

This is the sharpening I hoped the RFC would surface. If you’re game, I’d take the carrier-wave question deeper over a call — same problem from a Monty-shaped angle, and I’d learn from the friction.


TBP seems like a place I would be interested in. I am currently transitioning away from my VP of Engineering & Technology role at WHK. I have decided that I want to spend the last decade of my career doing something that fully interests me, and will benefit humanity.


Brainwaves synchronize the brain. For instance, aphasia is a loss of sync in one or both of the temporal lobes. Depression might be a synchronization problem as well. They are either the actual sync mechanism, or a byproduct of it.

I am not making any definitive statements about the purpose of brain waves in BNNs; I am not qualified to do so, and I do not have the data necessary to make a claim like that. I am simply using them as an analogy to develop an intuition around the use of a series of carrier waves (carrying information via AM, FM, and PM) in an ANN to provide a mechanism for setting ‘state’ and a framework for building spatio-temporal awareness as infrastructure. Now I need to build it, collect data, analyze that data and adjust as the data dictates: the standard process every real scientist uses to do meaningful work. I am not asking for more guesses; I am doing more of that myself than I am generally comfortable with. What I am asking for is evidence that this may or may not work, in the form of current research and/or white papers.

The “waves” observed on an EEG are the combined electromagnetic field of billions of digitally-spiking neurons in on/off state. Think of peaks as “information is moving” and valleys as “information is not moving”. That’s very different from a “carrier wave” in the RF engineering sense, where the information is always moving in a continuous manner.

Carrier waves are used in the radio world to multiplex several parallel information streams. In the brain, the information is the individual spikes themselves. All of the neurons already operate in parallel via digitally-transmitting axons, they don’t need to demultiplex anything.

If time becomes a property of the unit itself, you are essentially operating in an analog regime on digital chips, at a huge compute overhead penalty. In my opinion, this would be equivalent to hammering a screw. You should be optimizing toward the digital regime first and foremost.

In the brain, temporal dynamics are all imposed by the atomic speed of electrochemical reactions. In a computer, the temporal dynamics are imposed by its clock frequency, which varies from machine to machine. By using an external time source like TBP’s interval timer concept, you can easily uncouple AI algorithm processing from the temporal samples, letting the learning modules fly without affecting temporal data. If you choose a temporal substrate, you are committing to a discrete-step simulator, which brings its own can of worms in that regard.
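
A rough Python sketch of that uncoupling, under the assumption that every sample is stamped by an external time source before it reaches the processing code. The `Sample` type and `rates_of_change` function are hypothetical and are not TBP's interval timer design.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp_s: float  # stamped by an external clock, not by the processing loop
    value: float


def rates_of_change(samples):
    """Rate of change between consecutive samples, using their own timestamps.

    The result depends only on the stamped times, so it is the same whether
    the samples are processed in real time, in a batch, or on a faster machine.
    """
    return [
        (b.value - a.value) / (b.timestamp_s - a.timestamp_s)
        for a, b in zip(samples, samples[1:])
    ]


stream = [Sample(0.0, 1.0), Sample(0.5, 2.0), Sample(2.0, 5.0)]
rates = rates_of_change(stream)
```

Nothing in `rates_of_change` reads the machine's clock or counts loop iterations, which is the point being made about uncoupling.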

And after all that, you still need to figure out how the carrier waves combine to produce intelligent output. This is the hardest part by a long shot. There are some contenders in that space, especially in the field of reservoir computing. Your ideas remind me of the late Walter J. Freeman’s work, e.g. The KIV model of intentional dynamics and decision making. Although, I never managed to wrap my head around his theories. He does still have a few supporters out there.

I was discussing the design of my sensory neural network for Arachne with a friend, explaining how every change in sensory input will cause a ripple of activity throughout the network resulting in either a motor output, a memory adjustment or network structural adjustment. Consequently the speed of responses to sensory input changes is limited by the settling time of the network.

My friend pointed out that perhaps biological neural networks pipeline their processing, such that multiple sensory input changes are being processed at the same time, each at a different stage on its way through the hierarchy. It is this pipelining that enables the rapid responses to sensory input changes, and brain waves would be evidence of the pipelining. So those waves of spiking are each processing different sensory input events.
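
The pipelining idea can be sketched with a toy Python simulation in which each input advances one stage per tick. This is only an illustration of latency versus throughput with a hypothetical `run_pipeline` helper; it says nothing about how real neurons do it and it ignores feedback entirely.

```python
def run_pipeline(inputs, n_stages):
    """Advance items one stage per tick; return (tick, item) pairs as they exit.

    Items must not be None, since None marks an empty stage in this sketch.
    """
    stages = [None] * n_stages
    pending = list(inputs)
    outputs = []
    tick = 0
    while pending or any(s is not None for s in stages):
        tick += 1
        # Whatever sits in the last stage leaves the pipeline on this tick.
        if stages[-1] is not None:
            outputs.append((tick, stages[-1]))
        # Shift everything forward one stage and admit the next input, if any.
        stages = [pending.pop(0) if pending else None] + stages[:-1]
    return outputs


exits = run_pipeline(["A", "B", "C"], n_stages=3)
```

Each item still needs three ticks to traverse the three stages, but once the pipeline is full one result emerges per tick, so the response rate is no longer bounded by the full settling time.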

If this is true, then emulating biological neural network functionality just got an order of magnitude more difficult. How do you handle feedback when the neurons further back are already dealing with the next wave? Or perhaps it all just comes out in the wash?

If I had a definitive answer to that question I would already have a test model underway. The truth seems to be that we are still a significant way from understanding BNNs in a manner that easily translates to the buildout of similarly configured ANNs. I would love to have the funding to dive far deeper into that research, but as of yet I haven’t found a benefactor willing to support it, lol.