Stimulus symmetries can confound representational similarity analyses

Farhad Pashakhanloo; Jacob A. Zavatone-Veth

arxiv: 2605.21324 · v1 · pith:23JEMS67new · submitted 2026-05-20 · 🧬 q-bio.NC · cs.LG

Pith reviewed 2026-05-21 03:21 UTC · model grok-4.3

keywords representational similarity · neural codes · stimulus symmetries · RSM · drifting codes · SGD · neural networks · representational geometry

The pith

Stimulus symmetries can produce different RSMs for functionally equivalent neural representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that symmetries in the inputs to a network create many internally different but functionally equivalent codes. These codes can generate representational similarity matrices with qualitatively different geometries. Optimization methods like stochastic gradient descent or energetic regularization tend to produce sparse codes whose geometry drifts, so the RSMs drift as well. The same drifting behavior appears in networks trained on natural images even when the symmetry is latent rather than explicit. The result is that RSM-based comparisons become unreliable for nonlinear codes that are not related by a simple rotation.

Core claim

Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs that reflect qualitatively different representational geometries. SGD or energetic regularization generates sparse drifting codes leading to drifting RSMs, present even in image-trained networks with latent symmetry. Our results illustrate the challenges inherent in comparing nonlinear neural codes, when functionally-equivalent representations are not related by a simple rotation.

What carries the argument

The mapping from stimulus symmetries to multiple distinct neural codes that remain functionally equivalent yet yield non-equivalent RSM geometries unless the codes differ only by rotation.
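A minimal sketch of this mechanism, under assumptions that are not taken from the paper: four neurons with cosine tuning tile a ring, the tiling orientation `phi` plays the role of the gauge choice in Figure 1, and the RSM is taken to be the Gram matrix of population responses. The function names (`ring_code`, `rsm`, `gauge_gap`) and the rectified-cosine tuning shape are hypothetical choices for illustration.

```python
import numpy as np

def ring_code(stimuli, n_neurons=4, phi=0.0, rectify=True):
    """Population response of neurons whose preferred angles tile a ring.

    Cosine tuning, optionally half-wave rectified. The tuning shape is an
    illustrative assumption, not the paper's exact model.
    """
    centers = phi + 2 * np.pi * np.arange(n_neurons) / n_neurons
    drive = np.cos(stimuli[:, None] - centers[None, :])  # (stimuli, neurons)
    return np.maximum(drive, 0.0) if rectify else drive

def rsm(code):
    """Gram-matrix RSM: inner products between stimulus representations."""
    return code @ code.T

stimuli = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)

def gauge_gap(rectify, phi_a=0.0, phi_b=np.pi / 4):
    """Largest change in any RSM entry when the tiling is rotated."""
    rsm_a = rsm(ring_code(stimuli, phi=phi_a, rectify=rectify))
    rsm_b = rsm(ring_code(stimuli, phi=phi_b, rectify=rectify))
    return np.max(np.abs(rsm_a - rsm_b))

linear_gap = gauge_gap(rectify=False)
relu_gap = gauge_gap(rectify=True)
print(f"linear code RSM gap:    {linear_gap:.2e}")
print(f"rectified code RSM gap: {relu_gap:.2e}")
```

In this toy setting the linear codes at the two orientations differ only by a rotation, so their RSMs agree up to floating-point error, while the rectified codes give RSMs that differ by an order-one amount.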

If this is right

RSM comparisons can fail to detect that two representations are functionally equivalent under input symmetry.
Sparse codes induced by SGD or regularization cause RSMs to drift even after training converges.
The effect persists in networks trained on image data where the symmetry is latent.
Nonlinear codes not related by rotation cannot be compared reliably with standard RSM methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Analyses of representational geometry in neuroscience may need explicit tests for invariance under possible input transformations.
Drifting RSMs could contribute to variability seen across repeated experiments on similar stimuli.
Symmetry-aware summary statistics might reduce the confounding without requiring changes to the network itself.

Load-bearing premise

That functionally equivalent representations under stimulus symmetries will yield distinguishable RSMs when the codes are nonlinear and not related by rotation.
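One way to probe this premise numerically is an orthogonal Procrustes fit: if two codes differ only by a rotation, the best orthogonal map between them leaves no residual and their Gram-matrix RSMs coincide. The sketch below is an assumption-laden illustration, not the authors' procedure; `best_rotation_residual` and the ring-shaped toy codes are hypothetical.

```python
import numpy as np

def best_rotation_residual(code_a, code_b):
    """Relative error after optimally rotating code_a onto code_b.

    Solves min_O ||code_a @ O - code_b||_F over orthogonal O using an SVD
    (orthogonal Procrustes). Rows are stimuli, columns are neurons.
    """
    u, _, vt = np.linalg.svd(code_a.T @ code_b)
    rotation = u @ vt
    return np.linalg.norm(code_a @ rotation - code_b) / np.linalg.norm(code_b)

# Hypothetical toy codes: cosine tuning on a ring at two tiling orientations.
stimuli = np.linspace(0.0, 2 * np.pi, 16, endpoint=False)
centers = 2 * np.pi * np.arange(4) / 4

def code(phi, rectify):
    drive = np.cos(stimuli[:, None] - (centers[None, :] + phi))
    return np.maximum(drive, 0.0) if rectify else drive

linear_residual = best_rotation_residual(code(0.0, False), code(np.pi / 4, False))
relu_residual = best_rotation_residual(code(0.0, True), code(np.pi / 4, True))
print(f"linear codes:    residual {linear_residual:.2e}")
print(f"rectified codes: residual {relu_residual:.2e}")
```

A residual near zero indicates the two codes are related by an orthogonal map; a clearly nonzero residual, as for the rectified codes here, indicates they are not, which is the regime in which the premise says RSMs can differ.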

What would settle it

Train networks on inputs with known explicit symmetries, then check whether the RSMs stay identical across equivalent codes or continue to drift under SGD.

Figures

Figures reproduced from arXiv: 2605.21324 by Farhad Pashakhanloo, Jacob A. Zavatone-Veth.

**Figure 1.** Gauge-dependent representations in a toy model. a) Left: RF center vectors (w_i) of four neurons tiling a one-dimensional ring. To specify a tiling, one must choose a global orientation φ. Right: corresponding angular tuning curves. b) Representations h_1 and h_2 of two trial stimuli s_1 and s_2 in the four-dimensional representation space. The angle between h_1 and h_2 depends on φ. c) RSMs for two values of the…

**Figure 2.** Gauge dependence and the geometry of representational manifolds. The schematics show a) the latent space Z = S^1, and two examples of b) linear and c) nonlinear embeddings of it in R^3. Points are color-coded by the stimulus angle, and angle zero is marked by a star (⋆). The left and right columns have a gauge difference of ∆φ = π/2. The two manifolds in (b) can be transformed into each other using an orth…

**Figure 3.** Factors influencing RSM gauge dependence. a,b) RSM variability (∆) as a function of number of RFs and tuning width. In (b), the dashed red line shows the prediction from Eqn 12. c) RFs with amplitude noise (additive Gaussian noise with standard deviation 0.1). d) Corresponding φ-dependence of RSM over 100 realizations of noisy RFs (black line: average). We quantify the φ-dependence of the RSM by: ∆ := max_φ…

**Figure 4.** Collapse of neurons and RSM drift under continual SGD training. a) RSM values over time. b) Entropy of the distribution of cosine of pairwise angles between all neurons’ weights. c) RSMs (top), and the corresponding weight vectors (bottom) at different snapshots during training. Alignment of neurons’ weights (n = 15) into 4 orthogonal directions is evident through learning.

**Figure 6.** RSM variability as a result of continual training for three-dimensional input. (Left) Components of the empirical RSM as a function of time. The non-zero off-diagonal elements are grouped into three distinct sets (ρ1, ρ2, and ρ3) based on the predictions derived in Appendix D (this implies entries [0, 1], [0, 4], [1, 3], [3, 4], and their transposes vary as ρ1, and similarly for other groups). Gray curves cor…

**Figure 7.** RSM variability in autoencoding a rotated image manifold. Autoencoders were trained on rotated versions of a digit from the Kuzushiji-MNIST dataset compiled by Clanuwat et al. [22]. a) Four examples of the original and reconstructed images in an instance of the trained model. b) 3D PCA of the hidden layer activations in a trained model in response to all stimuli. Each point is color-coded by the rotation angle…
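The statistic named in Figure 4b, the entropy of the distribution of pairwise cosines between neurons' weight vectors, could be computed along the following lines. This is a sketch under assumptions: the caption does not give the binning or the logarithm base, so `n_bins=20` and natural-log Shannon entropy are placeholders, and the weight dimension of 4 and both weight matrices are synthetic.

```python
import numpy as np

def pairwise_cosine_entropy(weights, n_bins=20):
    """Shannon entropy (nats) of the histogram of pairwise cosines between rows."""
    unit = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)
    upper = cosines[np.triu_indices(len(weights), k=1)]  # count each pair once
    counts, _ = np.histogram(upper, bins=n_bins, range=(-1.0, 1.0))
    probs = counts[counts > 0] / counts.sum()
    return float(-np.sum(probs * np.log(probs)))

rng = np.random.default_rng(0)

# Synthetic stand-ins for two training snapshots of n = 15 neurons.
random_weights = rng.normal(size=(15, 4))
# Collapsed configuration: every neuron aligned with one of 4 orthogonal axes.
aligned_weights = np.eye(4)[rng.integers(0, 4, size=15)] * rng.uniform(0.5, 2.0, size=(15, 1))

random_entropy = pairwise_cosine_entropy(random_weights)
aligned_entropy = pairwise_cosine_entropy(aligned_weights)
print(f"random weights:  {random_entropy:.3f} nats")
print(f"aligned weights: {aligned_entropy:.3f} nats")
```

When weights collapse onto a few orthogonal directions, the pairwise cosines concentrate on 0 and 1, so this entropy is lower than for unstructured weights.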

Original abstract

What can representational similarity matrices (RSMs) tell us about a neural code? As the popularity of these summary statistics grows, so too does the need for a more complete characterization of their properties. Here, we show that symmetries in network inputs can confound RSM-based analyses. Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs. These different RSMs reflect qualitatively different representational geometries. We show that stochastic gradient descent or energetic regularization can generate sparse, drifting codes, leading in turn to drifting RSMs. Moreover, we demonstrate that these phenomena are present in networks trained to encode image data, where the symmetry is latent. Our results illustrate the challenges inherent in comparing nonlinear neural codes, when functionally-equivalent representations are not related by a simple rotation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that stimulus symmetries render many neural representations functionally equivalent, yet these can yield qualitatively different representational similarity matrices (RSMs) that reflect distinct geometries. It further argues that SGD or energetic regularization produces sparse drifting codes that cause drifting RSMs, and demonstrates the effect even in image-trained networks where the symmetry is latent. The central conclusion is that RSM-based comparisons of nonlinear codes are confounded when functionally equivalent representations are not related by simple rotations.

Significance. If the core observations hold after verification, the result would be moderately significant for the RSA literature in systems neuroscience and machine learning. It would highlight a previously under-appreciated confound arising from input symmetries and optimization dynamics, potentially motivating more rigorous controls when RSMs are used to compare representations across networks or conditions. The demonstration in latent-symmetry image networks is a strength if the symmetry group and functional equivalence are properly isolated.

major comments (2)

[Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.
[Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.

minor comments (1)

[Methods] Clarify how symmetries are detected or controlled in the training data and architecture, particularly for the latent-symmetry image case where the symmetry group is not enumerated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment below and indicate the revisions made.

Referee: [Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.

Authors: We agree that making the definition of functional equivalence explicit strengthens the manuscript. In the revised version, we have added a dedicated subsection in the Methods titled 'Defining Functional Equivalence under Group Actions' that includes the mathematical definition of the symmetry group acting on stimuli and the induced mapping on representations. We show that task performance metrics remain invariant under these mappings by proving that the output of the network is unchanged when inputs are transformed by the symmetry group. This isolates the effect from arbitrary nonlinear reparameterizations, as the mappings are explicitly constructed from the group action. revision: yes
Referee: [Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.

Authors: We appreciate this point and have expanded the manuscript accordingly. The revised Methods section now details the controls used, including comparisons to networks trained without regularization and with different random seeds. Drifting is quantified using the average pairwise distance between RSMs computed at different training epochs, with error bars representing standard deviation across 10 independent runs. We also include a baseline variability measure from fixed codes. These details are now reported in the Results section as well, with a new figure panel illustrating the quantification. revision: yes
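The drift measure described in this response (average pairwise distance between RSMs at different epochs, summarized across runs) might look like the sketch below. The response does not name the distance, so the Frobenius norm is an assumption, and `rsm_drift`, `summarize_runs`, and all data here are hypothetical placeholders rather than the authors' code or results.

```python
import numpy as np

def rsm_drift(rsms):
    """Mean Frobenius distance over all pairs of RSM snapshots.

    rsms has shape (n_snapshots, n_stimuli, n_stimuli).
    """
    rsms = np.asarray(rsms, dtype=float)
    i, j = np.triu_indices(len(rsms), k=1)
    return float(np.mean(np.linalg.norm(rsms[i] - rsms[j], axis=(1, 2))))

def summarize_runs(runs):
    """Mean and sample standard deviation of the drift score across runs."""
    scores = np.array([rsm_drift(run) for run in runs])
    return scores.mean(), scores.std(ddof=1)

rng = np.random.default_rng(0)

# Baseline: a fixed code yields the same RSM at every snapshot.
fixed = np.repeat(np.eye(3)[None], 5, axis=0)
baseline = rsm_drift(fixed)

# Synthetic "drifting" runs: the fixed RSM plus noise, 10 independent runs.
drifting_runs = [fixed + 0.1 * rng.normal(size=fixed.shape) for _ in range(10)]
mean_drift, std_drift = summarize_runs(drifting_runs)

print(f"baseline drift: {baseline:.3f}")
print(f"drifting runs:  {mean_drift:.3f} +/- {std_drift:.3f}")
```

Comparing the run-level mean and spread against the fixed-code baseline is one way to separate drift from measurement variability; other distances between RSMs would fit the same structure.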

Circularity Check

0 steps flagged

No circularity: empirical demonstrations of symmetry effects on RSMs are self-contained

The paper presents its core results as outcomes of explicit simulations and network training experiments (SGD/regularization producing sparse drifting codes and drifting RSMs, including in image-trained networks with latent symmetry). These are shown via concrete examples of different representational geometries arising from functionally equivalent codes, without any load-bearing step that reduces by definition, by fitted-parameter renaming, or by self-citation chain to the target claim. The abstract and described results treat the confounding as an observed phenomenon rather than a derived identity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the central claim implicitly assumes that functional equivalence under symmetry is well-defined and that RSM differences arise solely from representational geometry rather than measurement artifacts.

pith-pipeline@v0.9.0 · 5667 in / 1119 out tokens · 25039 ms · 2026-05-21T03:21:13.336796+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs... gauge transformation... (g·h)(x)=h(g^{-1}·x)

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: RSM gauge-dependence... only if there exists an orthogonal matrix O(g) such that (g·h)(x)=O(g)h(x)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 5 internal anchors

[1]

Learning lie groups for invariant visual perception

Rajesh Rao and Daniel Ruderman. Learning lie groups for invariant visual perception. In M. Kearns, S. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems, volume 11. MIT Press, 1998. URL https://proceedings.neurips.cc/paper_files/paper/1998/file/277281aada22045c03945dcb2ca6f2ec-Paper.pdf

[2]

Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478, 2021. URL https://geometricdeeplearning.com/

[3]

Transformation Properties of Learned Visual Representations

Taco S. Cohen and Max Welling. Transformation properties of learned visual representations. In International Conference on Learning Representations, 2015. URL https://arxiv.org/abs/1412.7659

[4]

On the generalization of equivariance and convolution in neural networks to the action of compact groups

Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 2747–2755. PMLR, 10–15 Jul 2018. URL https...

[5]

Towards a Definition of Disentangled Representations

Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a definition of disentangled representations. arXiv, 2018. URL https://arxiv.org/abs/1812.02230

[6]

Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Christopher J Cueva, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine Hermann, Kerem Oktar, Klaus Greff, Martin N Hebart, Nathan Cloos, Nikolaus Kriegeskorte, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert...

[7]

Position: The platonic representation hypothesis

Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. Position: The platonic representation hypothesis. In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors, Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine...

[8]

On the symmetries of deep learning models and their internal representations

Charles Godfrey, Davis Brown, Tegan Emerson, and Henry Kvinge. On the symmetries of deep learning models and their internal representations. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 11893–11905. Curran Associates, Inc., 2022. URL https://proceedi...

[9]

What representational similarity measures imply about decodable information

Sarah E Harvey, David Lipshutz, and Alex H Williams. What representational similarity measures imply about decodable information. In UniReps: 2nd Edition of the Workshop on Unifying Representations in Neural Models, 2024. URL https://openreview.net/forum?id=hqfzH6GCYj

[10]

Representational similarity analysis-connecting the branches of systems neuroscience

Nikolaus Kriegeskorte, Marieke Mur, and Peter A Bandettini. Representational similarity analysis-connecting the branches of systems neuroscience. Frontiers in systems neuroscience, 2:249, 2008

[11]

Neural tuning and representational geometry

Nikolaus Kriegeskorte and Xue-Xin Wei. Neural tuning and representational geometry. Nature Reviews Neuroscience, 22(11):703–718, Nov 2021. ISSN 1471-0048. doi:10.1038/s41583-021-00502-3. URL https://doi.org/10.1038/s41583-021-00502-3

[12]

Equivalence between representational similarity analysis, centered kernel alignment, and canonical correlations analysis

Alex H. Williams. Equivalence between representational similarity analysis, centered kernel alignment, and canonical correlations analysis. bioRxiv, 2024. doi:10.1101/2024.10.23.619871. URL https://www.biorxiv.org/content/early/2024/10/24/2024.10.23.619871

[13]

System identification of neural systems: If we got it right, would we know?

Yena Han, Tomaso A Poggio, and Brian Cheung. System identification of neural systems: If we got it right, would we know? In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Resear...

[14]

Readout representation: Redefining neural codes by input recovery

Shunsuke Onoo, Yoshihiro Nagano, and Yukiyasu Kamitani. Readout representation: Redefining neural codes by input recovery. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=pODHH9DLeA

[15]

Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex

D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106–154, 1962. doi:10.1113/jphysiol.1962.sp006837. URL https://physoc.onlinelibrary.wiley.com/doi/abs/10.1113/jphysiol.1962.sp006837

[16]

Manifold-tiling localized receptive fields are optimal in similarity-preserving neural networks

Anirvan Sengupta, Cengiz Pehlevan, Mariano Tepper, Alexander Genkin, and Dmitri Chklovskii. Manifold-tiling localized receptive fields are optimal in similarity-preserving neural networks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Assoc...

[17]

Toroidal topology of population activity in grid cells

Richard J. Gardner, Erik Hermansen, Marius Pachitariu, Yoram Burak, Nils A. Baas, Benjamin A. Dunn, May-Britt Moser, and Edvard I. Moser. Toroidal topology of population activity in grid cells. Nature, 602(7895):123–128, Feb 2022. ISSN 1476-4687. doi:10.1038/s41586-021-04268-7. URL https://doi.org/10.1038/s41586-021-04268-7

[18]

Historical roots of gauge invariance

J. D. Jackson and L. B. Okun. Historical roots of gauge invariance. Rev. Mod. Phys., 73:663–680, Sep 2001. doi:10.1103/RevModPhys.73.663. URL https://link.aps.org/doi/10.1103/RevModPhys.73.663

[19]

Matrix Analysis

Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2 edition,

[20]

doi:10.1017/CBO9781139020411

[21]

Lie Groups, Lie Algebras, and Representations: An Elementary Introduction

B.C. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics. Springer, 2003. ISBN 9780387401225. doi:10.1007/978-3-319-13467-3

[22]

The most uniform distribution of points on the sphere

Luca Maria Del Bono, Flavio Nicoletti, and Federico Ricci-Tersenghi. The most uniform distribution of points on the sphere. PLOS ONE, 19(12):1–24, 12 2024. doi:10.1371/journal.pone.0313863. URL https://doi.org/10.1371/journal.pone.0313863

[23]

Deep Learning for Classical Japanese Literature

Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki Yamamoto, and David Ha. Deep learning for Classical Japanese literature. In Workshop on Machine Learning for Creativity and Design, NeurIPS 2018, 2018. URL https://arxiv.org/abs/1812.01718

[24]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016

[25]

A convnet for the 2020s

Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022

[26]

DINOv2: Learning Robust Visual Features without Supervision

Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.

[27]

Understanding image representations by measuring their equivariance and equivalence

Karel Lenc and Andrea Vedaldi. Understanding image representations by measuring their equivariance and equivalence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015. URL https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Lenc_Understanding_Image_Representations_2015_CVPR_paper.html

[28]

What affects learned equivariance in deep image recognition models?

Robert-Jan Bruintjes, Tomasz Motyka, and Jan van Gemert. What affects learned equivariance in deep image recognition models? In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 4839–4847, 2023. doi:10.1109/CVPRW59228.2023.00512

[29]

Exploring the landscape of spatial robustness

Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. Exploring the landscape of spatial robustness. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 1802–1811. PMLR, 09–15 Jun 2019. URL ...

[30]

Andrew Kyle Lampinen, Stephanie C. Y. Chan, Yuxuan Li, and Katherine Hermann. Representation biases: will we achieve complete understanding by analyzing representations? arXiv,

[31]

URL https://arxiv.org/abs/2507.22216

[32]

Summary statistics of learning link changing neural representations to behavior

Jacob A. Zavatone-Veth, Blake Bordelon, and Cengiz Pehlevan. Summary statistics of learning link changing neural representations to behavior. Frontiers in Neural Circuits, 19, 2025. ISSN 1662-5110. doi:10.3389/fncir.2025.1618351. URL https://www.frontiersin.org/journals/neural-circuits/articles/10.3389/fncir.2025.1618351

[33]

Causes and consequences of representational drift

Michael E Rule, Timothy O’Leary, and Christopher D Harvey. Causes and consequences of representational drift. Current Opinion in Neurobiology, 58:141–147, 2019. ISSN 0959-4388. doi:10.1016/j.conb.2019.08.005. URL https://www.sciencedirect.com/science/article/pii/S0959438819300303

[34]

Drifting neuronal representations: Bug or feature?

Paul Masset, Shanshan Qin, and Jacob A Zavatone-Veth. Drifting neuronal representations: Bug or feature? Biological Cybernetics, pages 1–14, 2022. doi:10.1007/s00422-021-00916-3

[35]

Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning

Shanshan Qin, Shiva Farashahi, David Lipshutz, Anirvan M. Sengupta, Dmitri B. Chklovskii, and Cengiz Pehlevan. Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning. Nature Neuroscience, 26(2):339–349, 2023. ISSN 1546-1726. doi:10.1038/s41593-022-01225-z. URL https://doi.org/10.1038/s41593-022-01225-z

[36]

Stochastic gradient descent-induced drift of representation in a two-layer neural network

Farhad Pashakhanloo and Alexei Koulakov. Stochastic gradient descent-induced drift of representation in a two-layer neural network. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 27401–27419. PMLR, 2023. URL https://proceedings.mlr.press/v202/pashakhanloo23a.html

[37]

Contribution of task-irrelevant stimuli to drift of neural representations

Farhad Pashakhanloo. Contribution of task-irrelevant stimuli to drift of neural representations. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=jAoqtT58G4

[38]

Representational drift in primary olfactory cortex

Carl E Schoonover, Sarah N Ohashi, Richard Axel, and Andrew JP Fink. Representational drift in primary olfactory cortex. Nature, 594(7864):541–546, 2021. doi:10.1038/s41586-021-03628-7

[39]

Loss landscapes of regularized linear autoencoders

Daniel Kunin, Jonathan Bloom, Aleksandrina Goeva, and Cotton Seed. Loss landscapes of regularized linear autoencoders. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3560–3569. PMLR, 09–15 Jun 2019. URL https://proc...

[40]

Disentanglement with biological constraints: A theory of functional cell types

James C. R. Whittington, Will Dorrell, Surya Ganguli, and Timothy Behrens. Disentanglement with biological constraints: A theory of functional cell types. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=9Z_GfhZnGH.

[41]

Range, not independence, drives modularity in biologically inspired representations

Will Dorrell, Kyle Hsu, Luke Hollingsworth, Jin Hwa Lee, Jiajun Wu, Chelsea Finn, Peter E. Latham, Timothy Edward John Behrens, and James C. R. Whittington. Range, not independence, drives modularity in biologically inspired representations. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=B...

[42]

Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

Lukas Braun, Erin Grant, and Andrew M Saxe. Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=YucuAuXMpT

[43]

On privileged and convergent bases in neural network representations

Davis Brown, Nikhil Vyas, and Yamini Bansal. On privileged and convergent bases in neural network representations. arXiv, 2023. URL https://arxiv.org/abs/2307.12941

[44]

Privileged representational axes in biological and artificial neural networks

Meenakshi Khosla, Alex H Williams, Josh McDermott, and Nancy Kanwisher. Privileged representational axes in biological and artificial neural networks. bioRxiv, 2024. doi:10.1101/2024.06.20.599957. URL https://www.biorxiv.org/content/early/2024/06/20/2024.06.20.599957

[45]

Population codes enable learning from few examples by shaping inductive bias

Blake Bordelon and Cengiz Pehlevan. Population codes enable learning from few examples by shaping inductive bias. eLife, 11:e78606, 2022. ISSN 2050-084X. doi:10.7554/eLife.78606. URL https://doi.org/10.7554/eLife.78606

[46]

How does training shape the Riemannian geometry of neural network representations?

Jacob A Zavatone-Veth, Sheng Yang, Julian Alex Rubinfien, and Cengiz Pehlevan. How does training shape the Riemannian geometry of neural network representations? In NeurIPS 2025 Workshop on Symmetry and Geometry in Neural Representations, 2025. URL https://openreview.net/forum?id=BaVIDhh7bj

[47]

Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds

N Alex Cayco Gajic and Arthur Pellegrino. Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds. arXiv, 2026. URL https://arxiv.org/abs/2603.28764

[48]

Kernel methods for deep learning

Youngmin Cho and Lawrence K Saul. Kernel methods for deep learning. In Y . Bengio, D. Schu- urmans, J. Lafferty, C. Williams, and A. Culotta, editors,Advances in Neural Information Processing Systems, volume 22. Curran Associates, Inc., 2009. URL https://proceedings. neurips.cc/paper/2009/file/5751ec3e9a4feab575962e78e006250d-Paper.pdf

[49] Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clement Hongler, Wulfram Gerstner, and Johanni Brea. Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine...

[50] Kenji Fukumizu, Shoichiro Yamaguchi, Yoh-ichi Mototake, and Mirai Tanaka. Semi-flat minima and saddle points by embedding neural networks to overparameterization. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL h...

[1] Rajesh Rao and Daniel Ruderman. Learning Lie groups for invariant visual perception. In M. Kearns, S. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems, volume 11. MIT Press, 1998. URL https://proceedings.neurips.cc/paper_files/paper/1998/file/277281aada22045c03945dcb2ca6f2ec-Paper.pdf

[2] Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478, 2021. URL https://geometricdeeplearning.com/

[3] Taco S. Cohen and Max Welling. Transformation properties of learned visual representations. In International Conference on Learning Representations, 2015. URL https://arxiv.org/abs/1412.7659

[4] Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 2747–2755. PMLR, 10–15 Jul 2018. URL https...

[5] Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a definition of disentangled representations. arXiv, 2018. URL https://arxiv.org/abs/1812.02230

[6] Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Christopher J Cueva, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine Hermann, Kerem Oktar, Klaus Greff, Martin N Hebart, Nathan Cloos, Nikolaus Kriegeskorte, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert...

[7] Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. Position: The platonic representation hypothesis. In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors, Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine...

[8] Charles Godfrey, Davis Brown, Tegan Emerson, and Henry Kvinge. On the symmetries of deep learning models and their internal representations. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 11893–11905. Curran Associates, Inc., 2022. URL https://proceedi...

[9] Sarah E Harvey, David Lipshutz, and Alex H Williams. What representational similarity measures imply about decodable information. In UniReps: 2nd Edition of the Workshop on Unifying Representations in Neural Models, 2024. URL https://openreview.net/forum?id=hqfzH6GCYj

[10] Nikolaus Kriegeskorte, Marieke Mur, and Peter A Bandettini. Representational similarity analysis-connecting the branches of systems neuroscience. Frontiers in systems neuroscience, 2:249, 2008

[11] Nikolaus Kriegeskorte and Xue-Xin Wei. Neural tuning and representational geometry. Nature Reviews Neuroscience, 22(11):703–718, Nov 2021. ISSN 1471-0048. doi:10.1038/s41583-021-00502-3. URL https://doi.org/10.1038/s41583-021-00502-3

[12] Alex H. Williams. Equivalence between representational similarity analysis, centered kernel alignment, and canonical correlations analysis. bioRxiv, 2024. doi:10.1101/2024.10.23.619871. URL https://www.biorxiv.org/content/early/2024/10/24/2024.10.23.619871

[13] Yena Han, Tomaso A Poggio, and Brian Cheung. System identification of neural systems: If we got it right, would we know? In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Resear...

[14] Shunsuke Onoo, Yoshihiro Nagano, and Yukiyasu Kamitani. Readout representation: Redefining neural codes by input recovery. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=pODHH9DLeA

[15] D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology, 160(1):106–154, 1962. doi:10.1113/jphysiol.1962.sp006837. URL https://physoc.onlinelibrary.wiley.com/doi/abs/10.1113/jphysiol.1962.sp006837

[16] Anirvan Sengupta, Cengiz Pehlevan, Mariano Tepper, Alexander Genkin, and Dmitri Chklovskii. Manifold-tiling localized receptive fields are optimal in similarity-preserving neural networks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Assoc...

[17] Richard J. Gardner, Erik Hermansen, Marius Pachitariu, Yoram Burak, Nils A. Baas, Benjamin A. Dunn, May-Britt Moser, and Edvard I. Moser. Toroidal topology of population activity in grid cells. Nature, 602(7895):123–128, Feb 2022. ISSN 1476-4687. doi:10.1038/s41586-021-04268-7. URL https://doi.org/10.1038/s41586-021-04268-7

[18] J. D. Jackson and L. B. Okun. Historical roots of gauge invariance. Rev. Mod. Phys., 73:663–680, Sep 2001. doi:10.1103/RevModPhys.73.663. URL https://link.aps.org/doi/10.1103/RevModPhys.73.663

[19] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2nd edition. doi:10.1017/CBO9781139020411

[21] B.C. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics. Springer, 2003. ISBN 9780387401225. doi:10.1007/978-3-319-13467-3

[22] Luca Maria Del Bono, Flavio Nicoletti, and Federico Ricci-Tersenghi. The most uniform distribution of points on the sphere. PLOS ONE, 19(12):1–24, 12 2024. doi:10.1371/journal.pone.0313863. URL https://doi.org/10.1371/journal.pone.0313863

[23] Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki Yamamoto, and David Ha. Deep learning for Classical Japanese literature. In Workshop on Machine Learning for Creativity and Design, NeurIPS 2018, 2018. URL https://arxiv.org/abs/1812.01718

[24] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016

[25] Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022

[26] Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.

[27] Karel Lenc and Andrea Vedaldi. Understanding image representations by measuring their equivariance and equivalence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015. URL https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Lenc_Understanding_Image_Representations_2015_CVPR_paper.html

[28] Robert-Jan Bruintjes, Tomasz Motyka, and Jan van Gemert. What affects learned equivariance in deep image recognition models? In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 4839–4847, 2023. doi:10.1109/CVPRW59228.2023.00512

[29] Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. Exploring the landscape of spatial robustness. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 1802–1811. PMLR, 09–15 Jun 2019. URL ...

[30] Andrew Kyle Lampinen, Stephanie C. Y. Chan, Yuxuan Li, and Katherine Hermann. Representation biases: will we achieve complete understanding by analyzing representations? arXiv. URL https://arxiv.org/abs/2507.22216

[32] Jacob A. Zavatone-Veth, Blake Bordelon, and Cengiz Pehlevan. Summary statistics of learning link changing neural representations to behavior. Frontiers in Neural Circuits, 19, 2025. ISSN 1662-5110. doi:10.3389/fncir.2025.1618351. URL https://www.frontiersin.org/journals/neural-circuits/articles/10.3389/fncir.2025.1618351

[33] Michael E Rule, Timothy O'Leary, and Christopher D Harvey. Causes and consequences of representational drift. Current Opinion in Neurobiology, 58:141–147, 2019. ISSN 0959-4388. doi:10.1016/j.conb.2019.08.005. URL https://www.sciencedirect.com/science/article/pii/S0959438819300303

[34] Paul Masset, Shanshan Qin, and Jacob A Zavatone-Veth. Drifting neuronal representations: Bug or feature? Biological Cybernetics, pages 1–14, 2022. doi:10.1007/s00422-021-00916-3

[35] Shanshan Qin, Shiva Farashahi, David Lipshutz, Anirvan M. Sengupta, Dmitri B. Chklovskii, and Cengiz Pehlevan. Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning. Nature Neuroscience, 26(2):339–349, 2023. ISSN 1546-1726. doi:10.1038/s41593-022-01225-z. URL https://doi.org/10.1038/s41593-022-01225-z

[36] Farhad Pashakhanloo and Alexei Koulakov. Stochastic gradient descent-induced drift of representation in a two-layer neural network. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 27401–27419. PMLR, 2023. URL https://proceedings.mlr.press/v202/pashakhanloo23a.html

[37] Farhad Pashakhanloo. Contribution of task-irrelevant stimuli to drift of neural representations. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=jAoqtT58G4

[38] Carl E Schoonover, Sarah N Ohashi, Richard Axel, and Andrew JP Fink. Representational drift in primary olfactory cortex. Nature, 594(7864):541–546, 2021. doi:10.1038/s41586-021-03628-7

[39] Daniel Kunin, Jonathan Bloom, Aleksandrina Goeva, and Cotton Seed. Loss landscapes of regularized linear autoencoders. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3560–3569. PMLR, 09–15 Jun 2019. URL https://proc...

[40] James C. R. Whittington, Will Dorrell, Surya Ganguli, and Timothy Behrens. Disentanglement with biological constraints: A theory of functional cell types. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=9Z_GfhZnGH.

[41] Will Dorrell, Kyle Hsu, Luke Hollingsworth, Jin Hwa Lee, Jiajun Wu, Chelsea Finn, Peter E. Latham, Timothy Edward John Behrens, and James C. R. Whittington. Range, not independence, drives modularity in biologically inspired representations. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=B...

[42] Lukas Braun, Erin Grant, and Andrew M Saxe. Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=YucuAuXMpT
