Advancing the Biological Plausibility and Efficacy of Hebbian Convolutional Neural Networks

Esther Mondragon; Julian Jimenez Nimmo

arxiv: 2501.17266 · v2 · submitted 2025-01-06 · 💻 cs.NE · cs.CV


Pith reviewed 2026-05-23 06:13 UTC · model grok-4.3


keywords: Hebbian learning · Convolutional neural networks · Winner-takes-all · BCM rule · Lateral inhibition · Image classification · CIFAR-10 · Biological plausibility


The pith

A Hebbian CNN using hard WTA, Gaussian lateral inhibition, and the BCM rule matches backpropagation accuracy on CIFAR-10 while beating prior hard-WTA models by 10.6 percentage points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests combinations of local learning rules and competition mechanisms inside convolutional networks to replace backpropagation with Hebbian updates that stay biologically local. It identifies one specific architecture that reaches the same test accuracy as a fully supervised backpropagation network on CIFAR-10. The same model also improves on earlier hard-WTA Hebbian CNNs of identical depth and shows progressive abstraction across layers. Performance remains competitive on MNIST and STL-10. The work therefore presents an existence proof that a biologically constrained unsupervised rule set can close the accuracy gap with end-to-end gradient descent on standard image benchmarks.

Core claim

Integrating hard Winner-Takes-All competition, Gaussian lateral inhibition, and the Bienenstock-Cooper-Munro learning rule inside convolutional layers produces an optimal Hebbian model whose mean accuracy on the last half of CIFAR-10 test epochs equals 75.2 percent, matching an end-to-end backpropagation counterpart and exceeding the previous state-of-the-art hard-WTA CNN result of 64.6 percent by 10.6 percentage points, while also reaching 98 percent on MNIST and 69.5 percent on STL-10 and displaying increasingly complex receptive fields.
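The headline metric and the margin can be made concrete with a short sketch. It assumes that "mean accuracy on the last half of test epochs" means averaging a per-epoch test-accuracy curve over its final half; the paper's exact protocol is not given in the abstract. The function name `last_half_mean` and the toy curve are hypothetical and invented for illustration; only 75.2 and 64.6 come from the abstract.

```python
from statistics import mean

def last_half_mean(per_epoch_accuracy):
    """Mean of the final half of a list of per-epoch test accuracies.

    For an odd number of epochs the middle epoch is included in the average.
    """
    n = len(per_epoch_accuracy)
    if n == 0:
        raise ValueError("need at least one epoch")
    return mean(per_epoch_accuracy[n // 2:])

# Hypothetical per-epoch accuracies (percent); not data from the paper.
toy_curve = [40.0, 55.0, 65.0, 70.0, 74.0, 75.0, 76.0, 75.0]
toy_score = last_half_mean(toy_curve)  # averages the last 4 entries

# Margin between the two figures reported in the abstract, in percentage points.
margin_pp = round(75.2 - 64.6, 1)
print(toy_score, margin_pp)
```

Averaging over the later epochs smooths epoch-to-epoch noise, which is presumably why a window is reported rather than a single final value. The 10.6 figure is an absolute difference in percentage points, not a relative improvement.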

What carries the argument

The single integrated architecture that places hard WTA competition together with Gaussian lateral inhibition and the BCM rule inside convolutional layers to drive local, unsupervised feature learning.
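A minimal NumPy sketch of how the three mechanisms could interact on a single input patch follows. It is a toy under stated assumptions, not the authors' implementation: the units are placed on a hypothetical 1-D line so that a Gaussian over index distance can serve as the inhibition kernel, activity is rectified before the BCM term, and the threshold is a running average of squared activity (the textbook BCM form). All function names and the parameters `sigma`, `strength`, `lr`, and `decay` are invented for illustration.

```python
import numpy as np

def gaussian_inhibition(y, sigma=1.0, strength=0.5):
    """Subtract from each unit a Gaussian-weighted sum of the other units' activity."""
    idx = np.arange(y.size)
    dist = idx[:, None] - idx[None, :]            # pairwise index distance (toy 1-D layout)
    kernel = np.exp(-dist**2 / (2.0 * sigma**2))
    np.fill_diagonal(kernel, 0.0)                 # no self-inhibition
    return y - strength * (kernel @ y)

def hard_wta_mask(y):
    """One-hot mask selecting the single most active unit."""
    mask = np.zeros_like(y)
    mask[np.argmax(y)] = 1.0
    return mask

def local_update(W, theta, x, lr=0.01, decay=0.9):
    """One unsupervised step for a single flattened input patch x."""
    y = gaussian_inhibition(W @ x)
    mask = hard_wta_mask(y)
    post = np.maximum(y, 0.0)                     # rectified activity (assumption)
    # BCM term: potentiate when post > theta, depress when post < theta.
    # The WTA mask restricts the weight change to the winning unit.
    dW = lr * (mask * post * (post - theta))[:, None] * x[None, :]
    theta_new = decay * theta + (1.0 - decay) * post**2   # sliding threshold
    return W + dW, theta_new, mask

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))   # 4 hypothetical filters over a flattened 6-value patch
theta = np.ones(4)
x = rng.normal(size=6)
W_new, theta_new, mask = local_update(W, theta, x)
```

Every quantity used in the update (the patch `x`, the unit's own activity, and its own threshold) is available at the unit itself, which is the sense in which such a rule is local. No label or downstream error signal enters the step.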

If this is right

The same architecture reaches 98 percent accuracy on MNIST and 69.5 percent on STL-10.
Feature maps exhibit sparse hierarchical structure with receptive fields that grow more abstract in deeper layers.
Learned representations improve both classification performance and generalisability compared with earlier hard-WTA Hebbian CNNs.
The approach narrows the performance gap between local unsupervised rules and end-to-end backpropagation on image tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The result suggests that further scaling the same rule set to deeper or wider networks could be tested without introducing non-local signals.
The architecture may transfer to other sensory modalities where local competition and rate-based plasticity are plausible.
Direct comparison of receptive-field statistics between this model and primate visual cortex could be performed on the same stimuli.

Load-bearing premise

The specific combination of hard WTA, Gaussian lateral inhibition, and the BCM rule can be considered biologically tenable while still producing the reported accuracy gains.

What would settle it

A controlled ablation on CIFAR-10 that removes any one of the three components (hard WTA, Gaussian inhibition, or BCM) from the optimal architecture and measures whether accuracy falls below 75.2 percent or the 10.6-point margin over prior hard-WTA CNNs disappears.
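The leave-one-out ablation described above can be laid out as a small bookkeeping sketch. The component names, the helper functions, and the accuracy passed to `ablation_verdict` in the usage line are hypothetical; only the 75.2 and 64.6 reference values come from the abstract, and no ablation results are known from it.

```python
COMPONENTS = ("hard_wta", "gaussian_inhibition", "bcm")

def leave_one_out(components=COMPONENTS):
    """Map each ablation name to the set of components that remain enabled."""
    full = frozenset(components)
    return {f"no_{c}": full - {c} for c in components}

def ablation_verdict(ablated_acc, full_acc=75.2, baseline_acc=64.6):
    """Summarize measured ablation accuracies (percent) against the two references.

    ablated_acc: dict mapping ablation name -> measured accuracy.
    """
    return {
        name: {
            "drop_pp": round(full_acc - acc, 1),     # loss relative to the full model
            "margin_gone": acc <= baseline_acc,      # no longer beats the prior hard-WTA CNN
        }
        for name, acc in ablated_acc.items()
    }

configs = leave_one_out()
# Placeholder accuracy to show the output shape; not a result from the paper.
example = ablation_verdict({"no_bcm": 70.0})
print(sorted(configs), example)
```

Each of the three configurations would be trained under the same protocol as the full model, and the claim would be strengthened if every `drop_pp` were clearly positive.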

Original abstract

The research presented in this paper advances the integration of Hebbian learning into Convolutional Neural Networks (CNNs) for image processing, systematically exploring different architectures to build an optimal configuration, adhering to biological tenability. Hebbian learning operates on local unsupervised neural information to form feature representations, providing an alternative to the popular but arguably biologically implausible and computationally intensive backpropagation learning algorithm. The suggested optimal architecture significantly enhances recent research aimed at integrating Hebbian learning with competition mechanisms and CNNs, expanding their representational capabilities by incorporating hard Winner-Takes-All (WTA) competition, Gaussian lateral inhibition mechanisms, and Bienenstock-Cooper-Munro (BCM) learning rule in a single model. Mean accuracy classification measures during the last half of test epochs on CIFAR-10 revealed that the resulting optimal model matched its end-to-end backpropagation variant with 75.2% each, critically surpassing the state-of-the-art hard-WTA performance in CNNs of the same network depth (64.6%) by 10.6%. It also achieved competitive performance on MNIST (98%) and STL-10 (69.5%). Moreover, results showed clear indications of sparse hierarchical learning through increasingly complex and abstract receptive fields. In summary, our implementation enhances both the performance and the generalisability of the learnt representations and constitutes a crucial step towards more biologically realistic artificial neural networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

They combined hard WTA, Gaussian lateral inhibition, and BCM in one Hebbian CNN and report it matching backprop at 75.2% on CIFAR-10 while beating the prior hard-WTA baseline by 10.6 percentage points.


The paper's main result is the performance number: a Hebbian CNN using that exact trio of mechanisms reaches the same accuracy as its backprop counterpart on CIFAR-10 and improves on the cited hard-WTA CNN at identical depth. It also gives numbers on MNIST and STL-10 plus a qualitative note on receptive-field complexity. That combination and the reported lift over the 64.6% baseline are what is new here; earlier Hebbian CNN work already used subsets of these pieces, so the advance is mainly the joint implementation and the fresh accuracy figures. The abstract is straightforward about the architecture and the evaluation protocol, which helps. The soft spots are the usual ones for an abstract-only view: no error bars, no statistical tests, no ablation tables, and no training-loop details. Those gaps matter because the central claim rests on the updates staying local and unsupervised; without the methods section it is impossible to confirm there is no hidden non-local information or extra regularization that is doing the heavy lifting. The biological-tenability argument is asserted rather than demonstrated with new evidence, but that is common in this line of work and does not break the empirical claim. Readers working on local learning rules or cortical models will find the architecture choices and the benchmark numbers useful to cite or replicate. The paper is coherent on its own terms and the numbers are specific enough to be checked, so it deserves a serious referee rather than a desk reject.

Referee Report

1 major / 0 minor

Summary. The manuscript presents a Hebbian CNN architecture that integrates hard Winner-Takes-All (WTA) competition, Gaussian lateral inhibition, and the Bienenstock-Cooper-Munro (BCM) rule. It claims this optimal configuration achieves 75.2% mean classification accuracy on CIFAR-10 (matching an end-to-end backpropagation baseline and exceeding the 64.6% of prior hard-WTA CNNs of equivalent depth by 10.6 percentage points), with competitive results on MNIST (98%) and STL-10 (69.5%), while producing sparse hierarchical receptive fields and adhering to biological constraints.

Significance. If the performance parity with backpropagation is confirmed under controlled conditions, the work would advance the development of local, unsupervised learning rules for deep convolutional networks and strengthen the case for biologically tenable alternatives to backpropagation.

major comments (1)

Abstract: the reported 75.2% CIFAR-10 accuracy and 10.6-percentage-point improvement over prior hard-WTA results are presented without training details, error bars, statistical tests, ablation controls, or verification that updates remain strictly local; a complete methods section is required to establish that these numbers support the central claim of matching backpropagation performance.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback highlighting the need for supporting details behind the abstract claims. We address the single major comment below.


Referee: Abstract: the reported 75.2% CIFAR-10 accuracy and 10.6-percentage-point improvement over prior hard-WTA results are presented without training details, error bars, statistical tests, ablation controls, or verification that updates remain strictly local; a complete methods section is required to establish that these numbers support the central claim of matching backpropagation performance.

Authors: We agree that the abstract alone does not provide these supporting elements. The full manuscript already contains a Methods section that specifies the training protocol (including the strictly local nature of Hebbian, WTA, lateral inhibition, and BCM updates) and the experimental setup. To fully address the concern we will expand the Methods section with additional training hyperparameters, add error bars and statistical tests to the CIFAR-10 results, and include ablation controls comparing the full model against variants lacking individual components. The abstract will be lightly revised to point readers to the Methods section. These changes will be incorporated in the revised manuscript.

Revision: yes
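As an illustration of what error bars and a statistical test on the CIFAR-10 comparison might look like, the sketch below computes a per-model mean and sample standard deviation across seeds, plus Welch's t statistic with Welch-Satterthwaite degrees of freedom. The per-seed accuracies are hypothetical placeholders, the number of runs in the paper is not stated in the abstract, and the choice of Welch's test is an assumption rather than the authors' stated plan.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(runs):
    """Return (mean, sample standard deviation) of per-seed accuracies."""
    return mean(runs), stdev(runs)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = stdev(a) ** 2 / len(a)   # squared standard error of group a
    vb = stdev(b) ** 2 / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical per-seed test accuracies (percent); not data from the paper.
hebbian = [75.0, 75.4, 75.2]
backprop = [75.1, 75.3, 75.2]

heb_mean, heb_sd = summarize(hebbian)
t_stat, dof = welch_t(hebbian, backprop)
print(f"Hebbian: {heb_mean:.1f} +/- {heb_sd:.1f}, t = {t_stat:.3f}, df = {dof:.2f}")
```

A t statistic near zero would be consistent with the "matching" claim but would not by itself establish equivalence; an equivalence test with a pre-specified margin would address that more directly.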

Circularity Check

0 steps flagged

No significant circularity; purely empirical performance claims


The paper reports experimental results from training and testing Hebbian CNN variants on CIFAR-10, MNIST, and STL-10. Central claims are measured accuracies (75.2% matching backprop, 10.6 percentage points above prior hard-WTA). No equations, derivations, or predictions are presented that reduce to fitted inputs or self-citations by construction. The architecture search and rule combinations are described as design choices, with outcomes evaluated directly on benchmarks. This matches the default expectation of non-circular empirical work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit list of fitted parameters or axioms; typical ML hyperparameters such as learning rates and inhibition widths are likely present but unidentified here. No new entities are postulated.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Paper passage: "incorporating hard Winner-Takes-All (WTA) competition, Gaussian lateral inhibition mechanisms, and Bienenstock-Cooper-Munro (BCM) learning rule in a single model... 75.2% on CIFAR-10"

IndisputableMonolith/Foundation/DimensionForcing.lean · reality_from_one_distinction · tag: unclear
Paper passage: "Hard-WTA... BCM... lateral inhibition... 3-CNN layer architecture"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.