Paper: Task learning increases information redundancy of neural responses in macaque visual cortex

I stumbled upon an interesting paper that appears to provide some empirical support for both the distributed and heterarchical nature of Thousand Brains Theory. It revolves around another theory named “generative inference” and doesn’t mention TBT, but the results seem applicable to both.

PDF: https://drive.google.com/file/d/1irgi_2c335yurKHX0OscEQYEzOzi_TkQ/view?usp=sharing

Long summary from https://www.science.org/doi/10.1126/science.adw7707

Introduction

How does the brain transform sensory input into perception and behavior? The classic model guiding most of neuroscience and modern deep learning views perception as a largely feedforward process: Sensory signals are transformed from early to higher visual areas to make behaviorally relevant information more explicit. Feedback connections are thought to merely fine-tune this process—enhancing relevant features or suppressing noise during attention and learning.

An alternative framework, generative inference, posits that sensory processing is fundamentally bidirectional. In this view, neurons represent beliefs about causes in the external world, continuously updated by the exchange of information between sensory evidence (feedforward) and prior expectations (feedback).

Rationale

A recent theoretical prediction from the generative inference framework offers a decisive way to empirically distinguish these two models. Generative inference models predict an increased sharing of task-related information among sensory neurons while learning a perceptual decision-making task—manifesting as higher redundancy in their responses. This prediction directly opposes the classic model, which holds that learning and attention reduce redundancy and correlated variability to improve coding efficiency.

To test these conflicting predictions, we measured changes in information redundancy among neurons in visual area V4 of two macaque monkeys as they learned to discriminate between two orientations in two separate tasks (cardinal and oblique). Neural activity was recorded chronically using Utah arrays over weeks of training. We quantified information redundancy as the difference between the linear Fisher information carried by the intact population activity and that carried by the same population after removing correlations.

Results

At the start of learning, redundancy was near zero, indicating largely independent neural responses. Over the course of training, redundancy increased, ultimately reaching levels where roughly half of each neuron’s information was shared with other recorded neurons. Redundancy also increased dynamically within trials, over hundreds of milliseconds, consistent with the gradual accumulation and sharing of information. Increased redundancy did not result in a loss of information in the population but, instead, the individual-neuron information increased—both predicted by generative inference. Learning-related changes in redundancy were stronger during task performance compared with passive viewing on the same day, which suggests that the increase in redundancy owing to a redistribution of information depends on active task engagement.

Conclusion

Learning a perceptual task increased information redundancy among sensory neurons—a result that contradicts conventional understanding of the roles of learning and attention. Rather than eliminating correlated variability, learning appears to redistribute information across neurons through feedback and recurrent interactions, enabling consistent beliefs about the sensory world. These findings suggest that cortical sensory processing is best understood as a dynamic inference process—one that integrates prior expectations and sensory evidence—challenging the long-held assumption of a fundamentally unidirectional information flow during sensory processing in the brain.
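
To make the redundancy measure from the Rationale section more concrete, here is a minimal sketch of how one might compute it for two stimulus conditions. This is not the authors' code. The function names, the toy simulation, and the sign convention (correlation-free information minus intact information, so that shared information comes out positive) are all assumptions for illustration. It also uses naive plug-in estimates, with none of the bias correction a real analysis would need.

```python
import numpy as np


def _signal_and_cov(resp_a, resp_b, ds):
    """Mean response difference per unit stimulus change, and pooled noise covariance.

    resp_a, resp_b: arrays of shape (trials, neurons), one per stimulus condition.
    """
    dmu = (resp_b.mean(axis=0) - resp_a.mean(axis=0)) / ds
    cov = 0.5 * (np.cov(resp_a, rowvar=False) + np.cov(resp_b, rowvar=False))
    return dmu, cov


def intact_information(resp_a, resp_b, ds=1.0):
    """Linear Fisher information of the intact population: dmu^T Sigma^-1 dmu."""
    dmu, cov = _signal_and_cov(resp_a, resp_b, ds)
    return float(dmu @ np.linalg.solve(cov, dmu))


def independent_information(resp_a, resp_b, ds=1.0):
    """Same quantity with correlations removed (only the variances are kept)."""
    dmu, cov = _signal_and_cov(resp_a, resp_b, ds)
    return float(np.sum(dmu ** 2 / np.diag(cov)))


def redundancy(resp_a, resp_b, ds=1.0):
    """Assumed sign convention: positive when neurons share information."""
    return independent_information(resp_a, resp_b, ds) - intact_information(resp_a, resp_b, ds)


def simulate(n_trials, shared_sd, private_sd, rng):
    """Hypothetical toy data: two neurons whose means both shift by 1 between stimuli."""
    dmu = np.array([1.0, 1.0])

    def draw(mean):
        shared = rng.normal(0.0, shared_sd, size=(n_trials, 1))    # common fluctuation
        private = rng.normal(0.0, private_sd, size=(n_trials, 2))  # per-neuron noise
        return mean + shared + private

    return draw(np.zeros(2)), draw(dmu)


rng = np.random.default_rng(0)

# Noise shared along the signal direction: information is duplicated across neurons.
a, b = simulate(20000, shared_sd=1.0, private_sd=1.0, rng=rng)
print("correlated noise: ", redundancy(a, b))

# Same total variance per neuron, but no shared component.
a, b = simulate(20000, shared_sd=0.0, private_sd=np.sqrt(2.0), rng=rng)
print("independent noise:", redundancy(a, b))
```

In the correlated toy case the true covariance is [[2, 1], [1, 2]], so the intact information is 2/3 and the correlation-free information is 1, giving a redundancy of 1/3. In the independent case both quantities equal 1 and the redundancy is 0. The printed estimates should land close to those values, up to sampling noise.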

Hey @AgentRev, neat paper! It’s always nice to see experimental results that challenge the feed-forward view. I think the community has been moving towards alternatives for a while, but it’s probably still the dominant paradigm.

I would like to see a couple of variations with this setup.

  1. To my understanding, they should see the same effect in lower visual areas. V4 is a big jump up from V1. It’d be interesting (and critical) to find out whether information redundancy increases there as well.
  2. I’d expect larger neural participation in just about any task that involves reward, which could appear as information redundancy. It’s pretty common, and it’s one of the reasons why it’s hard to square experimental results between studies that involve reward vs passive viewing. So it’d be nice to have a passive viewing condition as a control.

That said, monkey studies are resource limited (compared with, say, mice), so maybe studies like that will show up down the line.

Those are just my thoughts on the paper sans a TBP-specific perspective. I’ve been mulling over how to distinguish TBP from generative inference from an information-redundancy perspective for about a week, and I’m not really sure how to answer that yet. We spend most of our time thinking about what kinds of information travel between columns and layers, and I’m not sure yet what predictions I would make about how this would show up in measurements from a multi-electrode array.

Cheers,

Scott

I looked up V1 redundancy. I couldn’t find anything directly comparable, but I did stumble upon two related papers (both of which, unsurprisingly, are also cited in the task learning paper).

The first paper (Bondy et al. 2018) found that the correlated variability of V1 neural responses changed systematically between cardinal and oblique patterns, confirming the involvement of feedback dynamics in the shaping of V1 responses.

The second paper (Lange et al. 2023) obtained a contradictory result: no evidence for significant trial-by-trial changes in correlated variability between cardinal and oblique patterns.

Lange et al. commented the following about the Bondy et al. study:

The most salient difference between the two studies is that whereas we trained monkeys to switch between tasks on a trial-by-trial basis, Bondy et al.’s monkeys performed only a single task per session, alternating between them by relearning each task over the course of several days. […] A second difference between our experimental design and that of Bondy et al. (41) is that our animals were not as extensively trained as theirs were at the time that neural recordings were made. […] Bondy et al., on the other hand, trained their animals extensively before neural recordings began: one of their animals (LEM) performed the same task in an earlier study (42) and thus had >2 yr of prior training; the second monkey (JBE) had >1 yr of experience with the task (A. Bondy, personal communication).

So, bridging the gap between the two studies, one can tentatively infer from these highlights that there is some degree of redundancy in V1 as well, but that V1 is less plastic in response to feedback signals than V4.

If those findings require such extensive training, then I don’t think non-reward, passive viewing would reveal much for a non-human subject.

As for how to distinguish TBT from generative inference from an information-redundancy perspective, GI doesn’t really flesh out the details as well as TBT does. They mention “neurons represent beliefs about causes in the external world”, but it’s very unclear to me how it all comes together. TBT has columnar voting, but GI doesn’t even present a consensus mechanism, and doesn’t talk about cortical columns at all, treating cortical regions as bags of neurons.

The technology of our time is unfortunately not advanced enough to accurately measure sufficient information at the individual column level. Utah arrays and electrodes are a bit of a crapshoot. Perhaps one day we will have micrometer-scale probes to capture per-column L1 signals…

The monkey always knew that the next stimulus would be strictly one of two: “|” or “—” in the cardinal task, “\” or “/” in the oblique task. It would be interesting to see how the redundancy changes as the number of expected stimuli gradually increases. If it didn’t change at all, that would suggest the redundancy was simply a measurement artifact. And if they saw a linear or nonlinear relationship, that would be more rigorous proof. Alas, they didn’t do that.
