Principled Training of Neural Networks with Direct Feedback Alignment
2 Pith papers cite this work. Polarity classification is still indexing.
abstract
The backpropagation algorithm has long been the canonical training method for neural networks. Modern paradigms are implicitly optimized for it, and numerous guidelines exist to ensure its proper use. Recently, synthetic gradient methods, in which the error gradient is only roughly approximated, have garnered interest. These methods not only better portray how biological brains learn, but also open new computational possibilities, such as updating layers asynchronously. Even so, they have failed to scale past simple tasks like MNIST or CIFAR-10. This is in part due to a lack of standards, leading to ill-suited models and practices that prevent such methods from performing at their best. In this work, we focus on direct feedback alignment and present a set of best practices justified by observations of the alignment angles. We characterize a bottleneck effect that prevents alignment in narrow layers, and hypothesize that it may explain why feedback alignment methods have yet to scale to large convolutional networks.
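To make the notion of an alignment angle concrete, here is a minimal sketch, assuming the commonly described formulation of direct feedback alignment: the output error is sent to a hidden layer through a fixed random matrix rather than through the transpose of the forward weights. It is not taken from the paper's code. The network shape (one tanh hidden layer, softmax cross-entropy output) and all names (`hidden_signals`, `angle_deg`, `rand_matrix`, `B`) are illustrative assumptions.

```python
import math
import random


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def angle_deg(u, v):
    """Angle in degrees between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # clamp rounding error
    return math.degrees(math.acos(cos))


def hidden_signals(x, target, W1, W2, B):
    """Return (backprop signal, DFA signal) at the hidden layer for one sample.

    Assumed shapes: W1 is n_hidden x n_in, W2 is n_out x n_hidden,
    B is n_hidden x n_out (the fixed feedback matrix, never trained).
    """
    h = [math.tanh(a) for a in matvec(W1, x)]
    logits = matvec(W2, h)
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    # Output error for softmax + cross-entropy: probabilities minus one-hot.
    e = [p / total - (1.0 if k == target else 0.0) for k, p in enumerate(exps)]
    fprime = [1.0 - hi * hi for hi in h]  # derivative of tanh
    W2_T = [list(col) for col in zip(*W2)]
    bp = [g * d for g, d in zip(matvec(W2_T, e), fprime)]  # true gradient signal
    dfa = [g * d for g, d in zip(matvec(B, e), fprime)]    # random-feedback signal
    return bp, dfa


def rand_matrix(rng, rows, cols, scale=1.0):
    """Hypothetical helper: Gaussian random matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden, n_out = 8, 16, 4
W1 = rand_matrix(rng, n_hidden, n_in, 0.3)
W2 = rand_matrix(rng, n_out, n_hidden, 0.3)
B = rand_matrix(rng, n_hidden, n_out)
x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]

bp, dfa = hidden_signals(x, 2, W1, W2, B)
print("alignment angle (degrees):", angle_deg(bp, dfa))
```

An angle below 90 degrees means the random-feedback update still points, on average, in a descent direction. In this sketch the weights are untrained, so the angle only reflects the random initialization. Tracking it over training would show whether alignment develops.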
fields
cs.LG 2
verdicts
UNVERDICTED 2
representative citing papers
Mono-Forward replaces Forward-Forward's contrastive goodness with local multi-class cross-entropy, outperforming vanilla FF and sometimes backpropagation while using 31% of its memory on MLP-Mixers for PathMNIST.
citing papers explorer
- Score Broadcast and Decorrelation: A General Framework for Broadcast-Based Credit Assignment
SBD unifies broadcast credit assignment for losses including cross-entropy via output-score orthogonality to hidden activations and reports performance gains on CIFAR-10 and Tiny ImageNet.
- Mono-Forward: Revisiting Forward-Forward through Objective-Locality Decomposition
Mono-Forward replaces Forward-Forward's contrastive goodness with local multi-class cross-entropy, outperforming vanilla FF and sometimes backpropagation while using 31% of its memory on MLP-Mixers for PathMNIST.