Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

Alper T. Erdogan; Cengiz Pehlevan; Mete Erdogan; Mustafa Uzun

arxiv: 2605.30638 · v1 · pith:JC7PQEM7new · submitted 2026-05-28 · 💻 cs.LG · cs.AI

Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment

Mustafa Uzun , Mete Erdogan , Cengiz Pehlevan , Alper T. Erdogan This is my paper

Pith reviewed 2026-06-29 08:25 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords broadcast credit assignmenterror broadcastorthogonality principlethree-factor learningscore vector expansiondifferentiable lossescross-entropy

0 comments

The pith

An orthogonality principle between output scores and hidden activations unifies broadcast-based credit assignment for general differentiable losses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a single orthogonality condition between the loss score at the output and activations in hidden layers holds whenever the optimal score has conditional mean zero given the input. This condition extends the earlier EBD framework from mean-squared error to unify broadcast credit assignment across families including cross-entropy, Bregman divergences, proper scoring rules, and exponential-family negative log-likelihoods. A sympathetic reader would care because it supplies a unified theoretical basis for three-factor learning rules that use the loss score as the broadcast neuromodulatory signal, while also enabling score vector expansion to enrich the decorrelation objective.

Core claim

The central claim is that the orthogonality between the output score (the gradient of the loss with respect to the final-layer output) and hidden-layer activations, which follows from the optimal score having conditional mean zero, supplies the theoretical grounding for broadcast-based credit assignment across standard differentiable-loss families. The framework derives the cross-entropy case explicitly, characterizes the admissible loss class, and introduces score vector expansion to enrich the broadcast signal while preserving the orthogonality framework.

What carries the argument

The orthogonality principle between the output score and hidden-layer activations.

If this is right

The same orthogonality unifies broadcast credit assignment for cross-entropy, Bregman divergences, proper scoring rules, and exponential-family negative log-likelihoods.
It grounds the three-factor learning rule with the neuromodulatory factor derived as the broadcast loss score.
Score vector expansion enriches the broadcast signal while preserving the orthogonality framework.
The resulting method substantially improves performance over existing broadcast approaches on CIFAR-10 and Tiny ImageNet.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could apply directly to other differentiable losses that satisfy the conditional mean zero property even if they are not listed.
The same orthogonality check might serve as a diagnostic for whether a given loss family admits broadcast credit assignment.
Score vector expansion could be combined with other decorrelation objectives outside the broadcast setting.
Empirical verification of the conditional mean zero property on trained networks would test a key step in the derivation.

Load-bearing premise

The optimal score has conditional mean zero given the input.

What would settle it

An observation that the optimal score lacks conditional mean zero for a loss in the claimed class, or that the broadcast rule derived from the orthogonality fails to produce the expected decorrelation on a cross-entropy task.

Figures

Figures reproduced from arXiv: 2605.30638 by Alper T. Erdogan, Cengiz Pehlevan, Mete Erdogan, Mustafa Uzun.

**Figure 2.** Figure 2: Poisson proof-of-concept: test excess negative log-likelihood relative to the Bayes oracle, [PITH_FULL_IMAGE:figures/full_fig_p020_2.png] view at source ↗

**Figure 3.** Figure 3: Poisson proof-of-concept: binned conditional-mean-zero metric versus epoch on a logarith [PITH_FULL_IMAGE:figures/full_fig_p020_3.png] view at source ↗

**Figure 4.** Figure 4: Training and test accuracy curves on CIFAR-10 for BP, DFA, SBD, and SBD with [PITH_FULL_IMAGE:figures/full_fig_p031_4.png] view at source ↗

**Figure 5.** Figure 5: Training and test accuracy curves on CIFAR-10 for BP and SBD with score expansion [PITH_FULL_IMAGE:figures/full_fig_p032_5.png] view at source ↗

**Figure 6.** Figure 6: Cosine similarity between the local SBD gradient and the diagnostic BP gradient for each [PITH_FULL_IMAGE:figures/full_fig_p034_6.png] view at source ↗

**Figure 7.** Figure 7: Training accuracy on Tiny ImageNet for BackProp, SBD, and SBD with score expansion [PITH_FULL_IMAGE:figures/full_fig_p038_7.png] view at source ↗

**Figure 8.** Figure 8: Test accuracy on Tiny ImageNet for BackProp, SBD, and SBD with score expansion [PITH_FULL_IMAGE:figures/full_fig_p038_8.png] view at source ↗

**Figure 9.** Figure 9: Training and test accuracy on Tiny ImageNet for DFA across 500 epochs. Each curve [PITH_FULL_IMAGE:figures/full_fig_p039_9.png] view at source ↗

read the original abstract

We introduce Score Broadcast and Decorrelation (SBD), a principled framework for broadcast-based credit assignment for general families of differentiable losses. Error broadcast is a biologically plausible alternative to backpropagation that sends output information to hidden layers without weight transport. The Error Broadcast and Decorrelation (EBD) framework, recently introduced for the mean-squared-error (MSE) setting, grounded this mechanism in the stochastic orthogonality of optimal estimators, under which the optimal residual is orthogonal to functions of the input. We generalize that foundation by introducing an orthogonality principle between the output score (the gradient of loss with respect to the final-layer output) and hidden-layer activations, which holds whenever the optimal score has conditional mean zero. This single principle unifies broadcast-based credit assignment across the standard differentiable-loss families, including cross-entropy, Bregman divergences, proper scoring rules, and exponential-family negative log-likelihoods. The framework supplies a theoretical grounding for the three-factor learning rule under general losses, with the neuromodulatory factor derived as the broadcast loss score. We derive the cross-entropy case explicitly, characterize the admissible loss class, and introduce a score vector expansion technique that enriches the broadcast signal while preserving the orthogonality framework. Experiments on CIFAR-10 and Tiny ImageNet show that SBD substantially improves over existing broadcast approaches, with score vector expansion delivering further gains. Overall, this work identifies the loss score as the signal to broadcast, supplies the orthogonality theory and theoretical grounding for the three-factor learning rule from neuroscience, and shows how score vector expansion enriches the decorrelation directions of the resulting objective.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SBD generalizes the EBD orthogonality from MSE residuals to loss scores for a wider set of differentiable losses, with the conditional-mean-zero property carrying over directly.

read the letter

The main takeaway is that this paper replaces the residual in the earlier EBD work with the gradient of the loss (the score) and shows that the same stochastic orthogonality to hidden activations still holds as long as the optimal score has conditional mean zero given the input. That single step unifies the broadcast mechanism across cross-entropy, Bregman divergences, proper scoring rules, and exponential-family NLL without needing new assumptions.

The derivation is straightforward from the law of total expectation once you accept the mean-zero property, which is standard for those loss families at optimality. The score-vector-expansion trick is presented as a practical addition that keeps the orthogonality while giving richer directions for decorrelation. Both moves look like genuine extensions of the cited EBD framework rather than re-labeling.

The experiments are described only at a high level in the abstract, claiming clear gains on CIFAR-10 and Tiny ImageNet over prior broadcast methods, but without numbers or statistical detail here it is hard to judge effect size. If the full paper supplies proper baselines and controls, that would strengthen the practical side.

This is aimed at people working on biologically plausible credit assignment and three-factor rules. The theoretical part is clean enough that a referee could check the derivations in a few hours and decide whether the unification is as general as claimed. I would send it out for review rather than desk-reject.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces Score Broadcast and Decorrelation (SBD) as a generalization of the Error Broadcast and Decorrelation (EBD) framework from the MSE setting to arbitrary differentiable losses. It posits a single orthogonality principle between the output score (gradient of the loss w.r.t. the final output) and hidden-layer activations, which is claimed to hold whenever the optimal score has conditional mean zero given the input. This principle is asserted to unify broadcast-based credit assignment across cross-entropy, Bregman divergences, proper scoring rules, and exponential-family negative log-likelihoods; the paper derives the cross-entropy case explicitly, characterizes the admissible loss class, introduces a score-vector expansion technique that preserves orthogonality, supplies a theoretical grounding for the three-factor learning rule, and reports that SBD with score-vector expansion yields substantial gains on CIFAR-10 and Tiny ImageNet.

Significance. If the orthogonality derivation and unification hold without additional assumptions, the work supplies a principled, loss-agnostic foundation for biologically plausible credit assignment that extends beyond MSE-specific results. The explicit identification of the loss score as the broadcast signal and the score-vector expansion method would constitute a concrete advance for both theoretical understanding of three-factor rules and practical algorithm design in settings that forbid weight transport.

minor comments (2)

The abstract asserts that experiments demonstrate substantial improvement and further gains from score-vector expansion, yet supplies neither quantitative deltas, baseline comparisons, nor statistical details; the results section should include these to allow readers to assess the magnitude and reliability of the reported gains.
The admissible loss class is characterized in the manuscript; a brief explicit statement of the necessary and sufficient conditions (e.g., in terms of the conditional-mean-zero property) would improve clarity for readers outside the immediate subfield.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the significance of the orthogonality principle, and recommendation of minor revision. No specific major comments were provided in the report, so we have no individual points to rebut. We will address any minor editorial or clarification suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The central orthogonality principle follows from the law of total expectation once E[score|x]=0 is granted at optimality; this is a standard property of optimal estimators for cross-entropy, Bregman divergences, proper scoring rules, and exponential-family NLL, and is not obtained by fitting inside the paper or by self-citation. The unification across loss families and the three-factor rule grounding are direct consequences of this external statistical fact rather than any reduction to the paper's own inputs. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided derivation outline.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the stochastic orthogonality property of optimal estimators and the conditional-mean-zero property of the optimal score; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption The optimal score has conditional mean zero given the input.
Invoked to establish orthogonality between output score and hidden activations (abstract paragraph on generalization of EBD).

pith-pipeline@v0.9.1-grok · 5834 in / 1382 out tokens · 21068 ms · 2026-06-29T08:25:29.981790+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

39 extracted references · 4 canonical work pages · 3 internal anchors

[1]

Rumelhart, Geoffrey E

David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 323:533--536, 1986

1986
[2]

The recent excitement about neural networks

Francis Crick. The recent excitement about neural networks. Nature, 337:129--132, 1989

1989
[3]

Lillicrap, Adam Santoro, Luke Marris, Colin J

Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman, and Geoffrey Hinton. Backpropagation and the brain. Nature Reviews Neuroscience, 21(6):335--346, 2020

2020
[4]

Humphreys, Timothy Lillicrap, and Douglas Tweed

Mohamed Akrout, Collin Wilson, Peter C. Humphreys, Timothy Lillicrap, and Douglas Tweed. Deep learning without weight transport. In Advances in Neural Information Processing Systems 32 (NeurIPS) , pages 974--982, 2019

2019
[5]

James C. R. Whittington and Rafal Bogacz. Theories of error back-propagation in the brain. Trends in Cognitive Sciences, 23(3):235--250, 2019

2019
[6]

Golkar, T

S. Golkar, T. Tesileanu, Y. Bahroun, A. Sengupta, and D. Chklovskii. Constrained predictive coding as a biologically plausible model of the cortical hierarchy. Advances in Neural Information Processing Systems 35 (NeurIPS) , pages 14155--14169, 2022

2022
[7]

Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Mete Erdogan, Cengiz Pehlevan, and Alper Tunga Erdogan. Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism. In Advances in Neural Information Processing Systems 38 (NeurIPS), 2025

2025
[8]

Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules

Nicolas Fr \'e maux and Wulfram Gerstner. Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules. Frontiers in Neural Circuits , 9:85, 2016

2016
[9]

Eligibility traces and plasticity on behavioral time scales: experimental support of neo- H ebbian three-factor learning rules

Wulfram Gerstner, Marco Lehmann, Vasiliki Liakoni, Dane Corneil, and Johanni Brea. Eligibility traces and plasticity on behavioral time scales: experimental support of neo- H ebbian three-factor learning rules. Frontiers in Neural Circuits , 12:53, 2018

2018
[10]

Learning with three factors: modulating H ebbian plasticity with errors

ukasz Ku \'s mierz, Takuya Isomura, and Taro Toyoizumi. Learning with three factors: modulating H ebbian plasticity with errors. Current Opinion in Neurobiology , 46:170--177, 2017

2017
[11]

Predictive reward signal of dopamine neurons

Wolfram Schultz. Predictive reward signal of dopamine neurons. Journal of Neurophysiology , 80(1):1--27, 1998

1998
[12]

Self-Supervised Learning with an Information Maximization Criterion

Serdar Ozsoy, Shadi Hamdan, Sercan Arik, Deniz Yuret, and Alper Erdogan. Self-Supervised Learning with an Information Maximization Criterion. In Advances in Neural Information Processing Systems 35 (NeurIPS), pages 35240--35253, 2022

2022
[13]

Bariscan Bozkurt, Cengiz Pehlevan, and Alper T. Erdogan. Correlative Information Maximization: A Biologically Plausible Approach to Supervised Deep Neural Networks without Weight Symmetry. In Advances in Neural Information Processing Systems 36 (NeurIPS), pages 34928--34941, 2023

2023
[14]

Bariscan Bozkurt, Ate s \.I sfendiyaro g lu, Cengiz Pehlevan, and Alper T. Erdogan. Correlative Information Maximization Based Biologically Plausible Neural Networks for Correlated Source Separation. In International Conference on Learning Representations (ICLR), 2023

2023
[15]

Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation

Benjamin Scellier and Yoshua Bengio. Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation. Frontiers in Computational Neuroscience, 11:24, 2017

2017
[16]

Equivalence of equilibrium propagation and recurrent backpropagation

Benjamin Scellier and Yoshua Bengio. Equivalence of equilibrium propagation and recurrent backpropagation. Neural Computation , 31(2):312--329, 2019

2019
[17]

How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation

Yoshua Bengio. How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation. CoRR, abs/1407.7906, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[18]

Difference Target Propagation

Dong-Hyun Lee, Saizheng Zhang, Asja Fischer, and Yoshua Bengio. Difference Target Propagation. In Machine Learning and Knowledge Discovery in Databases, pages 498--515, 2015

2015
[19]

Geoffrey E. Hinton. The Forward-Forward Algorithm: Some Preliminary Investigations. CoRR, abs/2212.13345, 2022

work page arXiv 2022
[20]

Rajesh P. N. Rao and Dana H. Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2:79--87, 1999

1999
[21]

James C. R. Whittington and Rafal Bogacz. An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Plasticity. Neural Computation, 29(5):1229--1262, 2017

2017
[22]

Contrastive Similarity Matching for Supervised Learning

Shanshan Qin, Nayantara Mudur, and Cengiz Pehlevan. Contrastive Similarity Matching for Supervised Learning. Neural Computation, 33(5):1300--1328, 2021

2021
[23]

Lillicrap, Daniel Cownden, Douglas B

Timothy P. Lillicrap, Daniel Cownden, Douglas B. Tweed, and Colin J. Akerman. Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7:13276, 2016

2016
[24]

Direct Feedback Alignment Provides Learning in Deep Neural Networks

Arild N kland. Direct Feedback Alignment Provides Learning in Deep Neural Networks. In Advances in Neural Information Processing Systems 29 (NeurIPS), pages 1037--1045, 2016

2016
[25]

Hinton, and Timothy Lillicrap

Sergey Bartunov, Adam Santoro, Blake Richards, Luke Marris, Geoffrey E. Hinton, and Timothy Lillicrap. Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS), 2018

2018
[26]

Efficient Convolutional Neural Network Training with Direct Feedback Alignment

Donghyeon Han and Hoi-jun Yoo. Efficient Convolutional Neural Network Training with Direct Feedback Alignment. CoRR, abs/1901.01986, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1901
[27]

Principled Training of Neural Networks with Direct Feedback Alignment

Julien Launay, Iacopo Poli, and Florent Krzakala. Principled Training of Neural Networks with Direct Feedback Alignment. CoRR, abs/1906.04554, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1906
[28]

Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures

Julien Launay, Iacopo Poli, François Boniface, and Florent Krzakala. Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures. In Advances in Neural Information Processing Systems 33 (NeurIPS), 2020

2020
[29]

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

Blake Bordelon and Cengiz Pehlevan. The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks. In International Conference on Learning Representations (ICLR), 2023

2023
[30]

Clark, L

David G. Clark, L. F. Abbott, and SueYeon Chung. Credit Assignment Through Broadcasting a Global Error Vector. In Advances in Neural Information Processing Systems 34 (NeurIPS), 2021

2021
[31]

Analysis synthesis telephony based on the maximum likelihood method

Fumitada Itakura and Satoshi Saito. Analysis synthesis telephony based on the maximum likelihood method. In Proceedings of the 6th International Congress on Acoustics, pages C--17--C--20. IEEE, 1968

1968
[32]

F\' e votte, N

C. F\' e votte, N. Bertin, and J.-L. Durrieu. Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis. Neural Computation, 21(3):793--830, 2009

2009
[33]

L. M. Bregman. The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200--217, 1967

1967
[34]

Tilmann Gneiting and Adrian E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association , 102(477):359--378, 2007

2007
[35]

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An Imperative Style, High-Perfo...

2019
[36]

Learning Multiple Layers of Features from Tiny Images

Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical Report, University of Toronto, 2009

2009
[37]

Tiny ImageNet Visual Recognition Challenge

Ya Le and Xuan Yang. Tiny ImageNet Visual Recognition Challenge. CS 231N course project report, Stanford University, 2015

2015
[38]

Friedman

Jerome H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics , 19(1):1--67, 1991

1991
[39]

Experiment tracking with Weights and Biases

Lukas Biewald. Experiment tracking with Weights and Biases. Software available from wandb.com, 2020

2020

[1] [1]

Rumelhart, Geoffrey E

David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 323:533--536, 1986

1986

[2] [2]

The recent excitement about neural networks

Francis Crick. The recent excitement about neural networks. Nature, 337:129--132, 1989

1989

[3] [3]

Lillicrap, Adam Santoro, Luke Marris, Colin J

Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman, and Geoffrey Hinton. Backpropagation and the brain. Nature Reviews Neuroscience, 21(6):335--346, 2020

2020

[4] [4]

Humphreys, Timothy Lillicrap, and Douglas Tweed

Mohamed Akrout, Collin Wilson, Peter C. Humphreys, Timothy Lillicrap, and Douglas Tweed. Deep learning without weight transport. In Advances in Neural Information Processing Systems 32 (NeurIPS) , pages 974--982, 2019

2019

[5] [5]

James C. R. Whittington and Rafal Bogacz. Theories of error back-propagation in the brain. Trends in Cognitive Sciences, 23(3):235--250, 2019

2019

[6] [6]

Golkar, T

S. Golkar, T. Tesileanu, Y. Bahroun, A. Sengupta, and D. Chklovskii. Constrained predictive coding as a biologically plausible model of the cortical hierarchy. Advances in Neural Information Processing Systems 35 (NeurIPS) , pages 14155--14169, 2022

2022

[7] [7]

Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism

Mete Erdogan, Cengiz Pehlevan, and Alper Tunga Erdogan. Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism. In Advances in Neural Information Processing Systems 38 (NeurIPS), 2025

2025

[8] [8]

Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules

Nicolas Fr \'e maux and Wulfram Gerstner. Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules. Frontiers in Neural Circuits , 9:85, 2016

2016

[9] [9]

Eligibility traces and plasticity on behavioral time scales: experimental support of neo- H ebbian three-factor learning rules

Wulfram Gerstner, Marco Lehmann, Vasiliki Liakoni, Dane Corneil, and Johanni Brea. Eligibility traces and plasticity on behavioral time scales: experimental support of neo- H ebbian three-factor learning rules. Frontiers in Neural Circuits , 12:53, 2018

2018

[10] [10]

Learning with three factors: modulating H ebbian plasticity with errors

ukasz Ku \'s mierz, Takuya Isomura, and Taro Toyoizumi. Learning with three factors: modulating H ebbian plasticity with errors. Current Opinion in Neurobiology , 46:170--177, 2017

2017

[11] [11]

Predictive reward signal of dopamine neurons

Wolfram Schultz. Predictive reward signal of dopamine neurons. Journal of Neurophysiology , 80(1):1--27, 1998

1998

[12] [12]

Self-Supervised Learning with an Information Maximization Criterion

Serdar Ozsoy, Shadi Hamdan, Sercan Arik, Deniz Yuret, and Alper Erdogan. Self-Supervised Learning with an Information Maximization Criterion. In Advances in Neural Information Processing Systems 35 (NeurIPS), pages 35240--35253, 2022

2022

[13] [13]

Bariscan Bozkurt, Cengiz Pehlevan, and Alper T. Erdogan. Correlative Information Maximization: A Biologically Plausible Approach to Supervised Deep Neural Networks without Weight Symmetry. In Advances in Neural Information Processing Systems 36 (NeurIPS), pages 34928--34941, 2023

2023

[14] [14]

Bariscan Bozkurt, Ate s \.I sfendiyaro g lu, Cengiz Pehlevan, and Alper T. Erdogan. Correlative Information Maximization Based Biologically Plausible Neural Networks for Correlated Source Separation. In International Conference on Learning Representations (ICLR), 2023

2023

[15] [15]

Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation

Benjamin Scellier and Yoshua Bengio. Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation. Frontiers in Computational Neuroscience, 11:24, 2017

2017

[16] [16]

Equivalence of equilibrium propagation and recurrent backpropagation

Benjamin Scellier and Yoshua Bengio. Equivalence of equilibrium propagation and recurrent backpropagation. Neural Computation , 31(2):312--329, 2019

2019

[17] [17]

How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation

Yoshua Bengio. How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation. CoRR, abs/1407.7906, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014

[18] [18]

Difference Target Propagation

Dong-Hyun Lee, Saizheng Zhang, Asja Fischer, and Yoshua Bengio. Difference Target Propagation. In Machine Learning and Knowledge Discovery in Databases, pages 498--515, 2015

2015

[19] [19]

Geoffrey E. Hinton. The Forward-Forward Algorithm: Some Preliminary Investigations. CoRR, abs/2212.13345, 2022

work page arXiv 2022

[20] [20]

Rajesh P. N. Rao and Dana H. Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2:79--87, 1999

1999

[21] [21]

James C. R. Whittington and Rafal Bogacz. An Approximation of the Error Backpropagation Algorithm in a Predictive Coding Network with Local Hebbian Synaptic Plasticity. Neural Computation, 29(5):1229--1262, 2017

2017

[22] [22]

Contrastive Similarity Matching for Supervised Learning

Shanshan Qin, Nayantara Mudur, and Cengiz Pehlevan. Contrastive Similarity Matching for Supervised Learning. Neural Computation, 33(5):1300--1328, 2021

2021

[23] [23]

Lillicrap, Daniel Cownden, Douglas B

Timothy P. Lillicrap, Daniel Cownden, Douglas B. Tweed, and Colin J. Akerman. Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7:13276, 2016

2016

[24] [24]

Direct Feedback Alignment Provides Learning in Deep Neural Networks

Arild N kland. Direct Feedback Alignment Provides Learning in Deep Neural Networks. In Advances in Neural Information Processing Systems 29 (NeurIPS), pages 1037--1045, 2016

2016

[25] [25]

Hinton, and Timothy Lillicrap

Sergey Bartunov, Adam Santoro, Blake Richards, Luke Marris, Geoffrey E. Hinton, and Timothy Lillicrap. Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures. In Advances in Neural Information Processing Systems 31 (NeurIPS), 2018

2018

[26] [26]

Efficient Convolutional Neural Network Training with Direct Feedback Alignment

Donghyeon Han and Hoi-jun Yoo. Efficient Convolutional Neural Network Training with Direct Feedback Alignment. CoRR, abs/1901.01986, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1901

[27] [27]

Principled Training of Neural Networks with Direct Feedback Alignment

Julien Launay, Iacopo Poli, and Florent Krzakala. Principled Training of Neural Networks with Direct Feedback Alignment. CoRR, abs/1906.04554, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1906

[28] [28]

Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures

Julien Launay, Iacopo Poli, François Boniface, and Florent Krzakala. Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures. In Advances in Neural Information Processing Systems 33 (NeurIPS), 2020

2020

[29] [29]

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

Blake Bordelon and Cengiz Pehlevan. The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks. In International Conference on Learning Representations (ICLR), 2023

2023

[30] [30]

Clark, L

David G. Clark, L. F. Abbott, and SueYeon Chung. Credit Assignment Through Broadcasting a Global Error Vector. In Advances in Neural Information Processing Systems 34 (NeurIPS), 2021

2021

[31] [31]

Analysis synthesis telephony based on the maximum likelihood method

Fumitada Itakura and Satoshi Saito. Analysis synthesis telephony based on the maximum likelihood method. In Proceedings of the 6th International Congress on Acoustics, pages C--17--C--20. IEEE, 1968

1968

[32] [32]

F\' e votte, N

C. F\' e votte, N. Bertin, and J.-L. Durrieu. Nonnegative matrix factorization with the Itakura-Saito divergence: With application to music analysis. Neural Computation, 21(3):793--830, 2009

2009

[33] [33]

L. M. Bregman. The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7(3):200--217, 1967

1967

[34] [34]

Tilmann Gneiting and Adrian E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association , 102(477):359--378, 2007

2007

[35] [35]

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An Imperative Style, High-Perfo...

2019

[36] [36]

Learning Multiple Layers of Features from Tiny Images

Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical Report, University of Toronto, 2009

2009

[37] [37]

Tiny ImageNet Visual Recognition Challenge

Ya Le and Xuan Yang. Tiny ImageNet Visual Recognition Challenge. CS 231N course project report, Stanford University, 2015

2015

[38] [38]

Friedman

Jerome H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics , 19(1):1--67, 1991

1991

[39] [39]

Experiment tracking with Weights and Biases

Lukas Biewald. Experiment tracking with Weights and Biases. Software available from wandb.com, 2020

2020