Stimulus symmetries can confound representational similarity analyses
Pith reviewed 2026-05-21 03:21 UTC · model grok-4.3
The pith
Stimulus symmetries can produce different representational similarity matrices (RSMs) for functionally equivalent neural representations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs that reflect qualitatively different representational geometries. Stochastic gradient descent (SGD) or energetic regularization can generate sparse, drifting codes, which in turn produce drifting RSMs; the same phenomena appear in networks trained on image data, where the symmetry is latent. The results illustrate the challenges inherent in comparing nonlinear neural codes when functionally equivalent representations are not related by a simple rotation.
What carries the argument
The mapping from stimulus symmetries to multiple distinct neural codes that remain functionally equivalent yet yield non-equivalent RSM geometries unless the codes differ only by rotation.
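A toy illustration of this mapping, not taken from the paper: stimuli sit on a ring, neurons have rectified-cosine tuning curves, and the group element g rotates every stimulus by a quarter turn. The names ring_code, rsm, and symmetry_gap, the tuning width, and both sets of preferred angles are hypothetical choices made for this sketch. With four evenly spaced neurons the shifted code is a permutation of the original (an orthogonal map), so the linear-kernel RSM should be unchanged; with three unevenly placed neurons no such permutation exists and the RSM changes.

```python
import math

def ring_code(theta, centers, width=1.2):
    # Rectified cosine tuning: a neuron fires only within `width` radians of its center.
    return [max(0.0, math.cos(theta - c) - math.cos(width)) for c in centers]

def rsm(codes):
    # Linear-kernel RSM: dot products between every pair of population vectors.
    return [[sum(a * b for a, b in zip(u, v)) for v in codes] for u in codes]

def max_abs_diff(A, B):
    # Largest entrywise difference between two matrices of the same shape.
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n_stim = 12
stimuli = [2 * math.pi * k / n_stim for k in range(n_stim)]
shift = math.pi / 2  # group element g: rotate every stimulus by a quarter turn

def symmetry_gap(centers):
    h = [ring_code(t, centers) for t in stimuli]
    # Induced action on the code: (g.h)(x) = h(g^{-1} x)
    gh = [ring_code(t - shift, centers) for t in stimuli]
    return max_abs_diff(rsm(h), rsm(gh))

uniform_centers = [k * math.pi / 2 for k in range(4)]  # hypothetical even tiling
uneven_centers = [0.0, 0.4, 2.5]                       # hypothetical sparse, uneven tiling

uniform_gap = symmetry_gap(uniform_centers)
uneven_gap = symmetry_gap(uneven_centers)
print(f"uniform tiling: {uniform_gap:.2e}, uneven tiling: {uneven_gap:.2e}")
```

Under these assumed parameters the uniform tiling gives a gap at the level of floating-point error, while the uneven tiling gives a clearly nonzero gap, even though the two uneven codes differ only by a relabeling of the stimuli.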
If this is right
- RSM comparisons can fail to detect that two representations are functionally equivalent under input symmetry.
- Sparse codes induced by SGD or regularization cause RSMs to drift even after training converges.
- The effect persists in networks trained on image data where the symmetry is latent.
- Nonlinear codes not related by rotation cannot be compared reliably with standard RSM methods.
Where Pith is reading between the lines
- Analyses of representational geometry in neuroscience may need explicit tests for invariance under possible input transformations.
- Drifting RSMs could contribute to variability seen across repeated experiments on similar stimuli.
- Symmetry-aware summary statistics might reduce the confounding without requiring changes to the network itself.
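One hypothetical example of such a statistic, offered as a sketch rather than anything proposed in the paper: when the symmetry is a known cyclic shift of the stimulus labels, the RSM can be averaged over the orbit of that shift. The function names rsm_from_centers and orbit_average and the toy ring code are assumptions made for illustration.

```python
import math

def rsm_from_centers(centers, offset_steps, n_stim=12):
    # Linear-kernel RSM of a rectified-cosine ring code,
    # with every stimulus shifted by `offset_steps` grid positions.
    step = 2 * math.pi / n_stim
    codes = [[max(0.0, math.cos((k - offset_steps) * step - c)) for c in centers]
             for k in range(n_stim)]
    return [[sum(a * b for a, b in zip(u, v)) for v in codes] for u in codes]

def orbit_average(K):
    # Average K[i][j] over all cyclic shifts of the stimulus labels.
    n = len(K)
    return [[sum(K[(i + g) % n][(j + g) % n] for g in range(n)) / n
             for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

centers = [0.0, 0.4, 2.5]                 # hypothetical uneven tiling
K = rsm_from_centers(centers, 0)          # original code
K_shifted = rsm_from_centers(centers, 3)  # same code after a 3-step stimulus shift

raw_gap = max_abs_diff(K, K_shifted)
averaged_gap = max_abs_diff(orbit_average(K), orbit_average(K_shifted))
print(f"raw RSM gap: {raw_gap:.2e}, orbit-averaged gap: {averaged_gap:.2e}")
```

The trade-off in this sketch is that orbit averaging discards any structure in the RSM that is not shift-invariant, and it requires the symmetry group to be known in advance, which is not the case when the symmetry is latent.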
Load-bearing premise
That functionally equivalent representations under stimulus symmetries will yield distinguishable RSMs when the codes are nonlinear and not related by rotation.
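The rotation case in this premise can be written out using the group action quoted from the paper elsewhere on this page, (g·h)(x) = h(g^{-1}·x). The sketch below assumes a linear-kernel RSM, K_ij = h(x_i)^T h(x_j); that kernel choice is an assumption here, and only the sufficiency direction (an orthogonal map implies an unchanged RSM) is shown.

```latex
\begin{align}
K_{ij} &= h(x_i)^\top h(x_j), \\
K^{(g)}_{ij} &= (g\cdot h)(x_i)^\top (g\cdot h)(x_j)
             = h(g^{-1}\cdot x_i)^\top h(g^{-1}\cdot x_j). \\
\intertext{If there exists an orthogonal matrix $O(g)$ such that $(g\cdot h)(x) = O(g)\,h(x)$, then}
K^{(g)}_{ij} &= h(x_i)^\top O(g)^\top O(g)\, h(x_j) = h(x_i)^\top h(x_j) = K_{ij}.
\end{align}
```

When no such O(g) exists, as can happen for nonlinear codes, the middle step is unavailable and K^{(g)} need not equal K.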
What would settle it
Train networks on inputs with known explicit symmetries, then check whether the RSMs stay identical across equivalent codes or continue to drift under SGD.
Original abstract
What can representational similarity matrices (RSMs) tell us about a neural code? As the popularity of these summary statistics grows, so too does the need for a more complete characterization of their properties. Here, we show that symmetries in network inputs can confound RSM-based analyses. Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs. These different RSMs reflect qualitatively different representational geometries. We show that stochastic gradient descent or energetic regularization can generate sparse, drifting codes, leading in turn to drifting RSMs. Moreover, we demonstrate that these phenomena are present in networks trained to encode image data, where the symmetry is latent. Our results illustrate the challenges inherent in comparing nonlinear neural codes, when functionally-equivalent representations are not related by a simple rotation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that stimulus symmetries render many neural representations functionally equivalent, yet these can yield qualitatively different representational similarity matrices (RSMs) that reflect distinct geometries. It further argues that SGD or energetic regularization produces sparse drifting codes that cause drifting RSMs, and demonstrates the effect even in image-trained networks where the symmetry is latent. The central conclusion is that RSM-based comparisons of nonlinear codes are confounded when functionally equivalent representations are not related by simple rotations.
Significance. If the core observations hold after verification, the result would be moderately significant for the RSA literature in systems neuroscience and machine learning. It would highlight a previously under-appreciated confound arising from input symmetries and optimization dynamics, potentially motivating more rigorous controls when RSMs are used to compare representations across networks or conditions. The demonstration in latent-symmetry image networks is a strength if the symmetry group and functional equivalence are properly isolated.
major comments (2)
- [Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.
- [Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.
minor comments (1)
- [Methods] Clarify how symmetries are detected or controlled in the training data and architecture, particularly for the latent-symmetry image case where the symmetry group is not enumerated.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment below and indicate the revisions made.
Point-by-point responses
- Referee: [Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.
  Authors: We agree that making the definition of functional equivalence explicit strengthens the manuscript. In the revised version, we have added a dedicated subsection in the Methods titled 'Defining Functional Equivalence under Group Actions' that includes the mathematical definition of the symmetry group acting on stimuli and the induced mapping on representations. We show that task performance metrics remain invariant under these mappings by proving that the output of the network is unchanged when inputs are transformed by the symmetry group. This isolates the effect from arbitrary nonlinear reparameterizations, as the mappings are explicitly constructed from the group action.
  Revision: yes
- Referee: [Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.
  Authors: We appreciate this point and have expanded the manuscript accordingly. The revised Methods section now details the controls used, including comparisons to networks trained without regularization and with different random seeds. Drifting is quantified using the average pairwise distance between RSMs computed at different training epochs, with error bars representing standard deviation across 10 independent runs. We also include a baseline variability measure from fixed codes. These details are now reported in the Results section as well, with a new figure panel illustrating the quantification.
  Revision: yes
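The quantification described in the second response could be sketched as follows. Synthetic RSMs from a toy ring code stand in for real network activity; the Frobenius norm as the distance, the random-walk drift model, and the names drift_score, toy_rsm, and simulate_run are all assumptions, since the page does not specify them.

```python
import math
import random
from itertools import combinations
from statistics import mean, stdev

def frobenius(A, B):
    # Frobenius distance between two equally sized matrices.
    return math.sqrt(sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)))

def drift_score(rsms):
    # Average pairwise distance between RSMs recorded at different epochs.
    return mean(frobenius(A, B) for A, B in combinations(rsms, 2))

def toy_rsm(centers, n_stim=8):
    # Linear-kernel RSM of a rectified-cosine code on a ring of stimuli.
    stimuli = [2 * math.pi * k / n_stim for k in range(n_stim)]
    codes = [[max(0.0, math.cos(t - c)) for c in centers] for t in stimuli]
    return [[sum(a * b for a, b in zip(u, v)) for v in codes] for u in codes]

def simulate_run(rng, n_epochs=5, step=0.3, drifting=True):
    # Hypothetical drift model: preferred angles take a Gaussian random walk.
    centers = [0.0, 0.4, 2.5]
    rsms = []
    for _ in range(n_epochs):
        rsms.append(toy_rsm(centers))
        if drifting:
            centers = [c + rng.gauss(0.0, step) for c in centers]
    return rsms

rng = random.Random(0)
drift_runs = [drift_score(simulate_run(rng)) for _ in range(10)]
baseline_runs = [drift_score(simulate_run(rng, drifting=False)) for _ in range(10)]

# Mean and standard deviation across the 10 independent runs.
print(f"drift:    {mean(drift_runs):.3f} +/- {stdev(drift_runs):.3f}")
print(f"baseline: {mean(baseline_runs):.3f} +/- {stdev(baseline_runs):.3f}")
```

In this toy setup the fixed-code baseline is exactly zero because the code is noiseless; with real recordings or stochastic networks the baseline would capture measurement and sampling variability instead.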
Circularity Check
No circularity: empirical demonstrations of symmetry effects on RSMs are self-contained
Full rationale
The paper presents its core results as outcomes of explicit simulations and network training experiments (SGD/regularization producing sparse drifting codes and drifting RSMs, including in image-trained networks with latent symmetry). These are shown via concrete examples of different representational geometries arising from functionally equivalent codes, without any load-bearing step that reduces by definition, by fitted-parameter renaming, or by self-citation chain to the target claim. The abstract and described results treat the confounding as an observed phenomenon rather than a derived identity.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs... gauge transformation... (g·h)(x) = h(g^{-1}·x)
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: RSM gauge-dependence... only if there exists an orthogonal matrix O(g) such that (g·h)(x) = O(g)h(x)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.