Equivariance and Augmentation for Bayesian Neural Networks

Axel Flinth; Jan E. Gerken; Miaowen Dong

arXiv: 2606.26273 · v1 · pith:HMYQOFWAnew · submitted 2026-06-24 · cs.LG

Pith reviewed 2026-06-26 01:47 UTC · model grok-4.3

classification: cs.LG

keywords: equivariance, data augmentation, Bayesian neural networks, variational inference, exponential family, symmetrization, orbit expansion

The pith

Data augmentation can yield exactly equivariant Bayesian neural networks when the variational distribution belongs to the exponential family and the conditions derived in the paper hold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether symmetries can be learned from augmented data rather than imposed through network architecture. It focuses on Bayesian neural networks trained via variational inference and shows that exact equivariance becomes possible under specific conditions on the variational distribution. Bounds on the remaining equivariance error are derived, and three new symmetrization methods are introduced to strengthen the effect of augmentation. Experiments indicate that one of these methods, orbit expansion, improves both equivariance and task performance over standard augmentation.

Core claim

For Bayesian neural networks trained with variational inference, the paper derives conditions under which data augmentation yields exact equivariance when the variational distribution belongs to the exponential family. It also supplies bounds on the equivariance error and presents three symmetrization techniques that amplify the effect of augmentation; in the numerical tests, orbit expansion outperforms the baseline.
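
For orientation, the two notions in this claim can be written out in their standard textbook form. The symbols below ($q_\eta$, $h$, $T$, $A$, $\rho_X$, $\rho_Y$, $f_w$) are assumptions introduced here for illustration and need not match the paper's notation. The last line is one natural reading of equivariance for a BNN predictive mean, not a statement quoted from the paper.

```latex
% Exponential-family variational distribution over the weights w,
% with natural parameter \eta, sufficient statistic T and log-partition A:
q_\eta(w) = h(w)\,\exp\!\big(\langle \eta, T(w)\rangle - A(\eta)\big),
\qquad
A(\eta) = \log \int h(w)\, e^{\langle \eta, T(w)\rangle}\,\mathrm{d}w .

% Example: a one-dimensional Gaussian N(\mu, \sigma^2) has
T(w) = (w,\, w^2), \qquad
\eta = \Big(\tfrac{\mu}{\sigma^2},\; -\tfrac{1}{2\sigma^2}\Big).

% Equivariance of a deterministic map f under a group G acting through
% representations \rho_X (inputs) and \rho_Y (outputs):
f\big(\rho_X(g)\,x\big) = \rho_Y(g)\,f(x) \qquad \text{for all } g \in G .

% Assumed predictive analogue for a BNN with weights w \sim q_\eta:
\mathbb{E}_{w \sim q_\eta}\big[f_w(\rho_X(g)\,x)\big]
  = \rho_Y(g)\,\mathbb{E}_{w \sim q_\eta}\big[f_w(x)\big]
  \qquad \text{for all } g \in G .
```

Invariance is the special case in which $\rho_Y(g)$ is the identity for every $g$.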

What carries the argument

Conditions on exponential-family variational distributions that turn data augmentation into exact equivariance, together with three symmetrization techniques (including orbit expansion) that reduce equivariance error.
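
A minimal sketch of what it can mean for natural parameters to lie in a symmetric subset, assuming a mean-field Gaussian over a single length-3 filter and a reflection that acts by reversing the filter. The function names, the toy numbers, and the choice of group averaging as the projection are illustrative assumptions, not the paper's implementation of its symmetrization techniques.

```python
import numpy as np

def gaussian_to_natural(mu, sigma):
    """Mean-field Gaussian N(mu, sigma^2) -> natural parameters (eta1, eta2)."""
    precision = 1.0 / sigma**2
    return mu * precision, -0.5 * precision

def natural_to_gaussian(eta1, eta2):
    """Inverse map; requires eta2 < 0 elementwise."""
    variance = -0.5 / eta2
    return eta1 * variance, np.sqrt(variance)

def average_over_group(eta, perms):
    """Average a parameter vector over a finite group acting by index permutations.

    If `perms` lists every element of the group, the result is left
    unchanged by each of those permutations.
    """
    return np.mean([eta[p] for p in perms], axis=0)

# Hypothetical example: the reflection group {identity, flip} on a length-3 filter.
perms = [np.array([0, 1, 2]), np.array([2, 1, 0])]
mu = np.array([1.0, 0.0, 3.0])
sigma = np.array([0.5, 1.0, 2.0])

eta1, eta2 = gaussian_to_natural(mu, sigma)
eta1_sym = average_over_group(eta1, perms)
eta2_sym = average_over_group(eta2, perms)
mu_sym, sigma_sym = natural_to_gaussian(eta1_sym, eta2_sym)

print("symmetrized mean :", mu_sym)
print("symmetrized sigma:", sigma_sym)
```

Averaging in natural-parameter space is not the same as averaging the means and standard deviations directly. For densities from the same exponential family, the arithmetic mean of natural parameters corresponds to the normalized geometric mean of the densities. Whether this coincides with any of the paper's three techniques is not established here.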

If this is right

Equivariance can be obtained without modifying the network architecture.
The three symmetrization techniques provide concrete ways to reduce equivariance error beyond plain augmentation.
Orbit expansion yields measurable gains in both symmetry and predictive performance on the tested tasks.
The derived bounds quantify how far a given augmentation scheme remains from exact equivariance.
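
One way to make "how far from exact equivariance" concrete is to measure it empirically. The sketch below assumes a toy linear-softmax classifier with isotropic Gaussian weights, the rotation group C4 acting on square images, and a symmetric KL divergence between predictive distributions as the defect. The model, the function names, and the exact form of the metric are assumptions for illustration; the paper's own defect and bounds may be defined differently.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def symmetric_kl(p, q, eps=1e-12):
    """KL(p || q) + KL(q || p) for two discrete distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum((p - q) * (np.log(p) - np.log(q))))

def predictive(x, mean, sigma, n_samples, rng):
    """Monte Carlo estimate of E_W[softmax(W vec(x))] with W ~ N(mean, sigma^2 I)."""
    n_classes, dim = mean.shape
    W = mean + sigma * rng.standard_normal((n_samples, n_classes, dim))
    logits = W @ x.reshape(-1)          # shape (n_samples, n_classes)
    return softmax(logits).mean(axis=0)

def invariance_defect(x, mean, sigma, n_samples=2000, seed=0):
    """Mean symmetric KL between predictions on x and on its three nontrivial C4 rotations."""
    rng = np.random.default_rng(seed)
    p_ref = predictive(x, mean, sigma, n_samples, rng)
    defects = []
    for k in (1, 2, 3):
        p_rot = predictive(np.rot90(x, k), mean, sigma, n_samples, rng)
        defects.append(symmetric_kl(p_ref, p_rot))
    return float(np.mean(defects))

def c4_average(mean, side):
    """Average each row of `mean`, viewed as a side x side image, over all four rotations."""
    imgs = mean.reshape(-1, side, side)
    avg = sum(np.rot90(imgs, k, axes=(1, 2)) for k in range(4)) / 4.0
    return avg.reshape(mean.shape)

# Hypothetical toy setup: 4x4 inputs, 3 classes.
rng = np.random.default_rng(1)
side, n_classes = 4, 3
x = rng.standard_normal((side, side))
mean = rng.standard_normal((n_classes, side * side))

defect_raw = invariance_defect(x, mean, sigma=0.1)
defect_sym = invariance_defect(x, c4_average(mean, side), sigma=0.1)
print("defect with generic mean    :", defect_raw)
print("defect with C4-averaged mean:", defect_sym)
```

With a rotation-averaged mean and isotropic noise, the weight distribution of this toy model is itself rotation-symmetric, so the remaining defect comes only from the finite number of Monte Carlo samples.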

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same exponential-family argument might extend to other approximate inference methods that admit closed-form updates.
If orbit expansion scales to larger models, it could reduce the need for hand-crafted equivariant layers in scientific applications.
The error bounds could be used to decide when augmentation alone is sufficient versus when architectural symmetry is still required.

Load-bearing premise

The variational distributions used in the Bayesian neural networks must belong to the exponential family.

What would settle it

A direct numerical check that, for a variational distribution outside the exponential family, the same augmentation procedure fails to produce exact equivariance even after symmetrization.

Figures

Figures reproduced from arXiv: 2606.26273 by Axel Flinth, Jan E. Gerken, Miaowen Dong.

**Figure 1.** Natural parameters $\eta$ for a variational distribution in the exponential family that lie in $H_G$ correspond to symmetric BNNs, here exemplified with a reflection symmetry. Our main Theorem 3.7 implies that $H_G$ is invariant for augmented training. Through the symmetrization strategies described in Section 3.4, we can increase the equivariance of the final model.

**Figure 2.** Two specializations of the orbit averaging […]

**Figure 3.** Filter arrangement for orbit expansion under […]

**Figure 4.** Empirical validation of Theorems 3.8 and 3.9. (a) The empirical equivariance defect decreases with $N_0$ under both invariant and random Gaussian priors, and the two curves converge. (b) K-fold standard deviation of the Monte Carlo estimate $\widehat{\Delta}^{\mathrm{eq}}_F(\eta; T)$ across $K = 10$ independent runs at each $T$. The decay matches the $O(1/\sqrt{T})$ rate predicted by Theorem 3.9 (dotted reference line, slope $-0.5$). (c) […]

**Figure 5.** Three filter arrangements for orbit expansion under […]

**Figure 6.** Two ways of choosing the intermediate representations acting on the filter banks. All of the […]

**Figure 7.** Trajectories for three variational families under […]

**Figure 8.** Accuracy versus equivariance across methods, optimizers, and trigger timings (FashionMNIST, $N_0 = 5000$, $C_4$; best-accuracy checkpoint, mean over 5 seeds). The $x$-axis is test accuracy and the $y$-axis is symmetric KL divergence on a log scale, so the best region is the lower right (more accurate and more equivariant). Hue encodes the method: rose for geometric averaging, blue for projection, amber for orbit ex…

**Figure 9.** Training trajectories under projection vs. geometric averaging with early vs. late triggers (FashionMNIST, $N_0 = 5000$, $C_4$, SGD; single seed, representative of the 5-seed runs in […]

**Figure 10.** Drift out of $H_G$ under SGD vs. AdamW for orbit expansion, at two expansion times (FashionMNIST, $N_0 = 5000$, $C_4$; mean over 5 seeds, shaded bands ±1 s.d.). Left: Stage-1 = 20 epochs; Right: Stage-1 = 100 epochs. Before the dotted line both optimizers train the same width-$1/|G|$ base network and their curves coincide. At expansion the symmetric KL of both runs drops to the same near-zero floor. During Stage 2 […]
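
The Figure 4 caption refers to a Monte Carlo estimate whose spread across 10 independent runs decays at a rate of order 1/√T. The generic mechanism can be illustrated with a small sketch: repeat a T-sample Monte Carlo mean K times, take the standard deviation across the K estimates, and fit the slope on a log-log scale. The integrand `sample_fn` and the helper names are hypothetical stand-ins; this does not reproduce the paper's estimator or Theorem 3.9.

```python
import numpy as np

def mc_std_across_runs(sample_fn, T_values, K=10, seed=0):
    """For each T, form K independent Monte Carlo means of T draws and return their std."""
    rng = np.random.default_rng(seed)
    stds = []
    for T in T_values:
        estimates = [sample_fn(rng, int(T)).mean() for _ in range(K)]
        stds.append(np.std(estimates, ddof=1))
    return np.array(stds)

def loglog_slope(T_values, stds):
    """Least-squares slope of log(std) against log(T)."""
    slope, _intercept = np.polyfit(np.log(T_values), np.log(stds), deg=1)
    return float(slope)

def sample_fn(rng, T):
    """Hypothetical bounded per-sample quantity standing in for a defect term."""
    return np.tanh(rng.standard_normal(T)) ** 2

T_values = np.array([10, 100, 1_000, 10_000, 100_000])
stds = mc_std_across_runs(sample_fn, T_values, K=10)
slope = loglog_slope(T_values, stds)
print("std per T    :", stds)
print("log-log slope:", slope)
```

Because only K = 10 runs enter each standard deviation, the fitted slope fluctuates around the reference value of −0.5 rather than matching it exactly.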

Original abstract

Symmetries are important for many deep learning tasks, ranging from applications in the sciences to medical imaging. However, there is an ongoing debate about whether to impose symmetry constraints on the neural network architecture (yielding equivariant neural networks) or learn them from augmented training data. Although equivariant networks are well-studied theoretically, much less is known about data augmentation, since analyzing augmentation requires control over the training dynamics. Inspired by recent results that show that augmented infinite deep ensembles are exactly equivariant, we study data augmentation for Bayesian neural networks (BNNs) trained with variational inference. We focus on variational distributions in the exponential family and derive conditions under which exact equivariance is reached. We furthermore obtain bounds on the equivariance error and introduce three novel symmetrization techniques which boost the effect of data augmentation in this setting. We conduct extensive numerical experiments which show that one of our symmetrization methods (orbit expansion) outperforms the baseline in both equivariance and overall performance. Our code is available at github.com/dmw1998/augment-BNNs
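
The augmentation route described in the abstract can be sketched for the rotation group C4 acting on square images. The code shows plain data augmentation in two common forms, a full-orbit copy of the dataset and random per-sample rotations. Function names and array shapes are assumptions for illustration. This is not the paper's orbit expansion technique, which according to the figure captions operates on filter arrangements of the network rather than on the data.

```python
import numpy as np

def augment_with_c4_orbit(images, labels):
    """Extend a dataset by all four 90-degree rotations of every image.

    images: array of shape (N, H, W) with H == W.
    labels: array of shape (N,), copied unchanged, i.e. the task is
            assumed to be rotation-invariant.
    """
    rotated = [np.rot90(images, k, axes=(1, 2)) for k in range(4)]
    aug_images = np.concatenate(rotated, axis=0)
    aug_labels = np.tile(labels, 4)
    return aug_images, aug_labels

def sample_augmented_batch(images, labels, batch_size, rng):
    """Stochastic variant: draw a batch and rotate each image by a random element of C4."""
    idx = rng.integers(0, len(images), size=batch_size)
    ks = rng.integers(0, 4, size=batch_size)
    batch = np.stack([np.rot90(images[i], int(k)) for i, k in zip(idx, ks)])
    return batch, labels[idx]

# Hypothetical toy data: two 3x3 "images" with labels 0 and 1.
images = np.arange(18, dtype=float).reshape(2, 3, 3)
labels = np.array([0, 1])

aug_x, aug_y = augment_with_c4_orbit(images, labels)
batch_x, batch_y = sample_augmented_batch(images, labels, 5, np.random.default_rng(0))
print(aug_x.shape, aug_y)
print(batch_x.shape, batch_y)
```

The full-orbit version multiplies the dataset size by the group order, which is only practical for small finite groups; the stochastic version keeps the dataset size fixed and samples the group during training.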

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper derives conditions for exact equivariance via augmentation in exponential-family variational BNNs, plus error bounds and three symmetrization methods, with orbit expansion outperforming the baseline in experiments.

The core result is that for BNNs trained with variational inference where the variational distribution belongs to the exponential family, data augmentation can produce exact equivariance under derived conditions, along with bounds on the remaining error and three symmetrization techniques. One of them, orbit expansion, beats baselines on both equivariance and task performance in the reported tests, and the code is released.

The paper takes the known exact-equivariance result for augmented infinite deep ensembles and adapts it to BNNs trained with variational inference, adding explicit conditions and bounds. The restriction to the exponential family is stated clearly up front, which keeps the claims honest. The experiments provide a concrete check on one technique, and releasing the code lets others verify the implementation.

The main limitation is the exponential-family restriction itself. Many practical BNN approximations use distributions outside that class, so the exact-equivariance guarantee does not automatically carry over. It is also not obvious from the abstract how much the three symmetrization methods differ from standard augmentation tricks already in the literature; they may be incremental rather than wholly new. The bounds are derived but their practical tightness is not explored in depth here.

This is useful for researchers who already work with exponential-family variational posteriors and care about symmetries in imaging or scientific data. It has enough formal derivation and empirical grounding to merit a serious referee, even if the scope stays narrow. I would send it to peer review.

Referee Report

0 major / 2 minor

Summary. The manuscript studies data augmentation for Bayesian neural networks trained via variational inference. Focusing on variational distributions belonging to the exponential family, it derives conditions for exact equivariance under data augmentation, obtains bounds on the equivariance error, and introduces three novel symmetrization techniques. Experiments indicate that the orbit expansion technique outperforms baselines in both equivariance and overall performance. Code is made available.

Significance. If the derivations hold, the work extends results on exact equivariance in infinite ensembles to the variational inference setting for BNNs, providing scoped theoretical conditions, error bounds, and practical symmetrization methods. The explicit restriction to the exponential family and the availability of code for reproducibility are strengths that support verification and potential adoption in symmetry-aware probabilistic modeling.

minor comments (2)

Abstract: while the three symmetrization techniques are introduced, only orbit expansion is named; briefly listing the other two would improve clarity on the contributions.
The experimental section would benefit from an explicit summary of datasets, metrics, and baseline details in the abstract or early introduction, so that readers can assess the outperformance claim without reading the full results.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of the theoretical contributions on exact equivariance conditions and error bounds for augmented BNNs under variational inference (restricted to the exponential family), and the recommendation for minor revision. The acknowledgment of the code availability and potential for adoption in symmetry-aware modeling is appreciated.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained under stated assumptions

full rationale

The paper explicitly scopes its analysis to variational distributions belonging to the exponential family, then derives equivariance conditions, error bounds, and three symmetrization techniques directly from that restriction and the data-augmentation setup. The cited inspiration from infinite-ensemble results functions only as motivation and is not used to define or force any of the new bounds or techniques. No self-citations are load-bearing, no parameters fitted to data are relabeled as predictions, and no uniqueness theorems or ansatzes are smuggled in via prior author work. The experimental validation of orbit expansion is independent of the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the assumption that variational distributions are members of the exponential family and on the existence of the cited prior result about augmented infinite ensembles.

axioms (1)

domain assumption: Variational distributions belong to the exponential family.
Explicitly stated as the focus of the derivation in the abstract.

invented entities (1)

orbit expansion (symmetrization technique) · no independent evidence
purpose: Boost the effect of data augmentation toward exact equivariance.
Presented as one of three novel techniques introduced in the paper.

