arxiv: 2605.21324 · v1 · pith:23JEMS67new · submitted 2026-05-20 · 🧬 q-bio.NC · cs.LG

Stimulus symmetries can confound representational similarity analyses

Pith reviewed 2026-05-21 03:21 UTC · model grok-4.3

classification 🧬 q-bio.NC cs.LG
keywords representational similarity · neural codes · stimulus symmetries · RSM · drifting codes · SGD · neural networks · representational geometry

The pith

Stimulus symmetries can produce different RSMs for functionally equivalent neural representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that symmetries in the inputs to a network create many internally different but functionally equivalent codes. These codes can generate representational similarity matrices with qualitatively different geometries. Optimization methods like stochastic gradient descent or energetic regularization tend to produce sparse codes whose geometry drifts, so the RSMs drift as well. The same drifting behavior appears in networks trained on natural images even when the symmetry is latent rather than explicit. The result is that RSM-based comparisons become unreliable for nonlinear codes that are not related by a simple rotation.
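The gauge dependence described above can be made concrete with a small numerical sketch. It assumes rectified-cosine tuning curves for four neurons tiling a ring and compares the cosine-similarity RSM entry for one stimulus pair under two global orientations φ. The tuning shape, the stimulus pair, and the helper names `ring_responses` and `cosine_rsm` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ring_responses(stim_angles, phi, n_neurons=4, rectify=True):
    """Population responses of neurons whose preferred angles tile a ring.

    Assumption: cosine tuning, optionally rectified; phi is the global
    orientation (gauge) of the tiling.
    """
    centers = phi + 2.0 * np.pi * np.arange(n_neurons) / n_neurons
    drive = np.cos(stim_angles[:, None] - centers[None, :])
    return np.maximum(drive, 0.0) if rectify else drive

def cosine_rsm(responses):
    """Cosine similarity between the response vectors of all stimulus pairs."""
    unit = responses / np.linalg.norm(responses, axis=1, keepdims=True)
    return unit @ unit.T

stimuli = np.array([0.0, np.pi / 4])   # two trial stimuli s1, s2
phis = (0.0, np.pi / 8)                # two choices of global orientation

# Nonlinear (rectified) code: the off-diagonal RSM entry depends on phi.
nonlinear = [cosine_rsm(ring_responses(stimuli, p))[0, 1] for p in phis]
# Linear (unrectified) code: the same entry is identical for both phi.
linear = [cosine_rsm(ring_responses(stimuli, p, rectify=False))[0, 1] for p in phis]

print("rectified:", np.round(nonlinear, 4))
print("linear:   ", np.round(linear, 4))
```

In this toy setting the rectified code gives a similarity of about 0.71 for φ = 0 and about 0.85 for φ = π/8, while the linear code gives the same value for both orientations, consistent with the claim that only codes related by a rotation share an RSM.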

Core claim

Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs that reflect qualitatively different representational geometries. SGD or energetic regularization can generate sparse, drifting codes, which in turn produce drifting RSMs; the same phenomena appear in image-trained networks where the symmetry is latent. The results illustrate the challenges inherent in comparing nonlinear neural codes when functionally equivalent representations are not related by a simple rotation.

What carries the argument

The mapping from stimulus symmetries to multiple distinct neural codes that remain functionally equivalent yet yield non-equivalent RSM geometries unless the codes differ only by rotation.
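Why rotation is the benign case can be written out in one line. As a sketch, let H be the P×N matrix whose rows are the population responses to P stimuli, and take the RSM to be the Gram matrix of those rows. The symbols H, Q, and K are notation introduced here for illustration, not taken from the paper.

```latex
K = H H^{\top}, \qquad
H' = H Q \ \ \text{with} \ \ Q^{\top} Q = Q Q^{\top} = I_N
\;\Longrightarrow\;
K' = H' H'^{\top} = H Q Q^{\top} H^{\top} = H H^{\top} = K .
```

Because an orthogonal Q also preserves the norm of every row, a cosine-normalized RSM is unchanged as well. When two functionally equivalent codes are not related by any such Q, nothing forces K' = K.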

If this is right

  • RSM comparisons can fail to detect that two representations are functionally equivalent under input symmetry.
  • Sparse codes induced by SGD or regularization cause RSMs to drift even after training converges.
  • The effect persists in networks trained on image data where the symmetry is latent.
  • Nonlinear codes not related by rotation cannot be compared reliably with standard RSM methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Analyses of representational geometry in neuroscience may need explicit tests for invariance under possible input transformations.
  • Drifting RSMs could contribute to variability seen across repeated experiments on similar stimuli.
  • Symmetry-aware summary statistics might reduce the confounding without requiring changes to the network itself.

Load-bearing premise

That functionally equivalent representations under stimulus symmetries will yield distinguishable RSMs when the codes are nonlinear and not related by rotation.

What would settle it

Train networks on inputs with known explicit symmetries, then check whether the RSMs stay identical across equivalent codes or continue to drift under SGD.

Figures

Figures reproduced from arXiv: 2605.21324 by Farhad Pashakhanloo, Jacob A. Zavatone-Veth.

Figure 1: Gauge-dependent representations in a toy model. a) Left: RF center vectors (w_i) of four neurons tiling a one-dimensional ring. To specify a tiling, one must choose a global orientation φ. Right: corresponding angular tuning curves. b) Representations h1 and h2 of two trial stimuli s1 and s2 in the four-dimensional representation space. The angle between h1 and h2 depends on φ. c) RSMs for two values of the…
Figure 2: Gauge dependence and the geometry of representational manifolds. The schematics show a) the latent space Z = S^1, and two examples of b) linear and c) nonlinear embeddings of it in R^3. Points are color-coded by the stimulus angle, and angle zero is marked by a star (⋆). The left and right columns have a gauge difference of ∆φ = π/2. The two manifolds in (b) can be transformed into each other using an orth…
Figure 3: Factors influencing RSM gauge dependence. a,b) RSM variability (∆) as a function of number of RFs and tuning width. In (b), the dashed red line shows the prediction from Eqn 12. c) RFs with amplitude noise (additive Gaussian noise with standard deviation 0.1). d) Corresponding φ-dependence of RSM over 100 realizations of noisy RFs (black line: average). We quantify the φ-dependence of the RSM by: ∆ := max_φ…
Figure 4: Collapse of neurons and RSM drift under continual SGD training. a) RSM values over time. b) Entropy of the distribution of cosine of pairwise angles between all neurons’ weights. c) RSMs (top), and the corresponding weight vectors (bottom) at different snapshots during training. Alignment of neurons’ weights (n = 15) into 4 orthogonal directions is evident through learning…
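The entropy statistic mentioned in panel (b) of the Figure 4 caption can be sketched as follows. This is a minimal illustration that assumes a fixed-width histogram estimator over the pairwise cosines and a four-dimensional weight space. The function name `pairwise_cosine_entropy`, the bin count, and the synthetic weights are hypothetical choices, and the paper's estimator may differ.

```python
import numpy as np

def pairwise_cosine_entropy(weights, bins=20):
    """Shannon entropy (nats) of the histogram of pairwise weight cosines.

    Assumption: a fixed-width histogram on [-1, 1]; the paper's estimator
    may differ.
    """
    unit = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    rows, cols = np.triu_indices(len(weights), k=1)
    cosines = np.clip(np.sum(unit[rows] * unit[cols], axis=1), -1.0, 1.0)
    counts, _ = np.histogram(cosines, bins=bins, range=(-1.0, 1.0))
    probs = counts[counts > 0] / counts.sum()
    return float(-(probs * np.log(probs)).sum())

rng = np.random.default_rng(0)
spread = rng.standard_normal((15, 4))      # 15 neurons, unstructured weights
collapsed = np.eye(4)[np.arange(15) % 4]   # same count, aligned to 4 orthogonal axes

h_spread = pairwise_cosine_entropy(spread)
h_collapsed = pairwise_cosine_entropy(collapsed)
print("unstructured:", h_spread, "collapsed:", h_collapsed)
```

When the weights collapse onto orthogonal directions, every pairwise cosine is either 0 or 1, so the histogram occupies at most two bins and the entropy cannot exceed ln 2. Unstructured weights spread the cosines over many bins and give a larger value.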
Figure 6: RSM variability as a result of continual training for three-dimensional input. (Left) Components of the empirical RSM as a function of time. The non-zero off-diagonal elements are grouped into three distinct sets (ρ1, ρ2, and ρ3) based on the predictions derived in Appendix D (this implies entries [0, 1], [0, 4], [1, 3], [3, 4], and their transpose vary as ρ1, and similarly for other groups). Gray curves cor…
Figure 7: RSM variability in autoencoding a rotated image manifold. Autoencoders were trained on rotated versions of a digit from the Kuzushiji-MNIST dataset compiled by Clanuwat et al. [22]. a) Four examples of the original and reconstructed images in an instance of the trained model. b) 3D PCA of the hidden layer activations in a trained model in response to all stimuli. Each point is color-coded by the rotation angle…
Original abstract

What can representational similarity matrices (RSMs) tell us about a neural code? As the popularity of these summary statistics grows, so too does the need for a more complete characterization of their properties. Here, we show that symmetries in network inputs can confound RSM-based analyses. Stimulus symmetries render many representations functionally equivalent, but these different configurations can lead to different RSMs. These different RSMs reflect qualitatively different representational geometries. We show that stochastic gradient descent or energetic regularization can generate sparse, drifting codes, leading in turn to drifting RSMs. Moreover, we demonstrate that these phenomena are present in networks trained to encode image data, where the symmetry is latent. Our results illustrate the challenges inherent in comparing nonlinear neural codes, when functionally-equivalent representations are not related by a simple rotation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that stimulus symmetries render many neural representations functionally equivalent, yet these can yield qualitatively different representational similarity matrices (RSMs) that reflect distinct geometries. It further argues that SGD or energetic regularization produces sparse drifting codes that cause drifting RSMs, and demonstrates the effect even in image-trained networks where the symmetry is latent. The central conclusion is that RSM-based comparisons of nonlinear codes are confounded when functionally equivalent representations are not related by simple rotations.

Significance. If the core observations hold after verification, the result would be moderately significant for the RSA literature in systems neuroscience and machine learning. It would highlight a previously under-appreciated confound arising from input symmetries and optimization dynamics, potentially motivating more rigorous controls when RSMs are used to compare representations across networks or conditions. The demonstration in latent-symmetry image networks is a strength if the symmetry group and functional equivalence are properly isolated.

major comments (2)
  1. [Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.
  2. [Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.
minor comments (1)
  1. [Methods] Clarify how symmetries are detected or controlled in the training data and architecture, particularly for the latent-symmetry image case where the symmetry group is not enumerated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our manuscript. We address each major comment below and indicate the revisions made.

  1. Referee: [Abstract / central claim] The central claim requires explicit verification that the observed codes differ by a symmetry-induced mapping (rather than arbitrary nonlinear reparameterization) while preserving task performance. The skeptic note correctly identifies that without a section or equation defining functional equivalence under the group action and showing invariance of task metrics, the confounding effect of symmetries is not isolated from other sources of representational drift.

    Authors: We agree that making the definition of functional equivalence explicit strengthens the manuscript. In the revised version, we have added a dedicated subsection in the Methods titled 'Defining Functional Equivalence under Group Actions' that includes the mathematical definition of the symmetry group acting on stimuli and the induced mapping on representations. We show that task performance metrics remain invariant under these mappings by proving that the output of the network is unchanged when inputs are transformed by the symmetry group. This isolates the effect from arbitrary nonlinear reparameterizations, as the mappings are explicitly constructed from the group action. revision: yes

  2. Referee: [Abstract / results description] The abstract describes demonstrations of drifting RSMs under SGD/regularization but provides no details on controls, error bars, or how drifting is quantified versus baseline variability. This is load-bearing for the claim that drifting codes are generated by regularization rather than other factors; a methods or results section should report these quantifications.

    Authors: We appreciate this point and have expanded the manuscript accordingly. The revised Methods section now details the controls used, including comparisons to networks trained without regularization and with different random seeds. Drifting is quantified using the average pairwise distance between RSMs computed at different training epochs, with error bars representing standard deviation across 10 independent runs. We also include a baseline variability measure from fixed codes. These details are now reported in the Results section as well, with a new figure panel illustrating the quantification. revision: yes
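The drift measure described in this response can be sketched in a few lines. The response does not name the distance, so the Frobenius norm is assumed here. The function names and the synthetic snapshots are hypothetical stand-ins rather than the authors' code or data.

```python
import numpy as np

def mean_pairwise_rsm_distance(rsms):
    """Average distance between all pairs of RSM snapshots from one run.

    Assumption: Frobenius distance; the response does not specify the metric.
    """
    rsms = np.asarray(rsms, dtype=float)
    first, second = np.triu_indices(rsms.shape[0], k=1)
    distances = np.linalg.norm(rsms[first] - rsms[second], axis=(1, 2))
    return float(distances.mean())

def drift_summary(runs):
    """Mean and standard deviation of the drift measure across runs."""
    per_run = np.array([mean_pairwise_rsm_distance(r) for r in runs])
    return float(per_run.mean()), float(per_run.std(ddof=1))

def symmetric_noise(rng, size, scale):
    """Synthetic symmetric perturbation, used only to fake drifting RSMs."""
    a = rng.standard_normal((size, size)) * scale
    return (a + a.T) / 2.0

rng = np.random.default_rng(0)
# 10 runs x 5 snapshots of a 3x3 RSM: one drifting set, one fixed baseline.
drifting_runs = [[np.eye(3) + symmetric_noise(rng, 3, 0.1) for _ in range(5)]
                 for _ in range(10)]
fixed_runs = [[np.eye(3)] * 5 for _ in range(10)]

drift_mean, drift_sd = drift_summary(drifting_runs)
fixed_mean, fixed_sd = drift_summary(fixed_runs)
print("drifting:", drift_mean, "+/-", drift_sd)
print("fixed:   ", fixed_mean, "+/-", fixed_sd)
```

A fixed code gives zero drift by construction, which is the kind of baseline the response mentions. Any real comparison would also need the seed and regularization controls described there.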

Circularity Check

0 steps flagged

No circularity: empirical demonstrations of symmetry effects on RSMs are self-contained

The paper presents its core results as outcomes of explicit simulations and network training experiments (SGD/regularization producing sparse drifting codes and drifting RSMs, including in image-trained networks with latent symmetry). These are shown via concrete examples of different representational geometries arising from functionally equivalent codes, without any load-bearing step that reduces by definition, by fitted-parameter renaming, or by self-citation chain to the target claim. The abstract and described results treat the confounding as an observed phenomenon rather than a derived identity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the central claim implicitly assumes that functional equivalence under symmetry is well-defined and that RSM differences arise solely from representational geometry rather than measurement artifacts.

pith-pipeline@v0.9.0 · 5667 in / 1119 out tokens · 25039 ms · 2026-05-21T03:21:13.336796+00:00 · methodology

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · 5 internal anchors

  1. [1]

    Learning lie groups for invariant visual perception

    Rajesh Rao and Daniel Ruderman. Learning lie groups for invariant visual perception. In M. Kearns, S. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems, volume 11. MIT Press, 1998. URL https://proceedings.neurips.cc/paper_files/paper/1998/file/277281aada22045c03945dcb2ca6f2ec-Paper.pdf

  2. [2]

    Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges

    Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478, 2021. URL https://geometricdeeplearning.com/

  3. [3]

    Transformation Properties of Learned Visual Representations

    Taco S. Cohen and Max Welling. Transformation properties of learned visual representations. In International Conference on Learning Representations, 2015. URL https://arxiv.org/abs/1412.7659

  4. [4]

    On the generalization of equivariance and convolution in neural networks to the action of compact groups

    Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 2747–2755. PMLR, 10–15 Jul 2018. URL https...

  5. [5]

    Towards a Definition of Disentangled Representations

    Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a definition of disentangled representations. arXiv, 2018. URL https://arxiv.org/abs/1812.02230

  6. [6]

    Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Christopher J Cueva, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine Hermann, Kerem Oktar, Klaus Greff, Martin N Hebart, Nathan Cloos, Nikolaus Kriegeskorte, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert...

  7. [7]

    Position: The platonic representation hypothesis

    Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. Position: The platonic representation hypothesis. In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors, Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine...

  8. [8]

    On the symmetries of deep learning models and their internal representations

    Charles Godfrey, Davis Brown, Tegan Emerson, and Henry Kvinge. On the symmetries of deep learning models and their internal representations. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 11893–11905. Curran Associates, Inc., 2022. URL https://proceedi...

  9. [9]

    What representational similarity measures imply about decodable information

    Sarah E Harvey, David Lipshutz, and Alex H Williams. What representational similarity measures imply about decodable information. In UniReps: 2nd Edition of the Workshop on Unifying Representations in Neural Models, 2024. URL https://openreview.net/forum?id=hqfzH6GCYj

  10. [10]

    Representational similarity analysis-connecting the branches of systems neuroscience

    Nikolaus Kriegeskorte, Marieke Mur, and Peter A Bandettini. Representational similarity analysis-connecting the branches of systems neuroscience. Frontiers in systems neuroscience, 2:249, 2008

  11. [11]

    Neural tuning and representational geometry

    Nikolaus Kriegeskorte and Xue-Xin Wei. Neural tuning and representational geometry. Nature Reviews Neuroscience, 22(11):703–718, Nov 2021. ISSN 1471-0048. doi:10.1038/s41583-021-00502-3. URL https://doi.org/10.1038/s41583-021-00502-3.

  12. [12]

    Equivalence between representational similarity analysis, centered kernel alignment, and canonical correlations analysis

    Alex H. Williams. Equivalence between representational similarity analysis, centered kernel alignment, and canonical correlations analysis. bioRxiv, 2024. doi:10.1101/2024.10.23.619871. URL https://www.biorxiv.org/content/early/2024/10/24/2024.10.23.619871

  13. [13]

    Yena Han, Tomaso A Poggio, and Brian Cheung. System identification of neural systems: If we got it right, would we know? In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Resear...

  14. [14]

    Readout representation: Redefining neural codes by input recovery

    Shunsuke Onoo, Yoshihiro Nagano, and Yukiyasu Kamitani. Readout representation: Redefining neural codes by input recovery. In The Fourteenth International Conference on Learning Representations, 2026. URL https://openreview.net/forum?id=pODHH9DLeA

  15. [15]

    D. H. Hubel and T. N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106–154, 1962. doi:https://doi.org/10.1113/jphysiol.1962.sp006837. URL https://physoc.onlinelibrary.wiley.com/doi/abs/10.1113/jphysiol.1962.sp006837

  16. [16]

    Manifold-tiling localized receptive fields are optimal in similarity-preserving neural networks

    Anirvan Sengupta, Cengiz Pehlevan, Mariano Tepper, Alexander Genkin, and Dmitri Chklovskii. Manifold-tiling localized receptive fields are optimal in similarity-preserving neural networks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 31. Curran Assoc...

  17. [17]

    Toroidal topology of population activity in grid cells

    Richard J. Gardner, Erik Hermansen, Marius Pachitariu, Yoram Burak, Nils A. Baas, Benjamin A. Dunn, May-Britt Moser, and Edvard I. Moser. Toroidal topology of population activity in grid cells. Nature, 602(7895):123–128, Feb 2022. ISSN 1476-4687. doi:10.1038/s41586-021-04268-7. URL https://doi.org/10.1038/s41586-021-04268-7

  18. [18]

    J. D. Jackson and L. B. Okun. Historical roots of gauge invariance. Rev. Mod. Phys., 73:663–680, Sep 2001. doi:10.1103/RevModPhys.73.663. URL https://link.aps.org/doi/10.1103/RevModPhys.73.663

  19. [19]

    Matrix Analysis

    Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2 edition,

  20. [20]

    doi:10.1017/CBO9781139020411

  21. [21]

    Lie Groups, Lie Algebras, and Representations: An Elementary Introduction

    B.C. Hall. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics. Springer, 2003. ISBN 9780387401225. doi:10.1007/978-3-319-13467-3

  22. [22]

    The most uniform distribution of points on the sphere

    Luca Maria Del Bono, Flavio Nicoletti, and Federico Ricci-Tersenghi. The most uniform distribution of points on the sphere. PLOS ONE, 19(12):1–24, 12 2024. doi:10.1371/journal.pone.0313863. URL https://doi.org/10.1371/journal.pone.0313863

  23. [23]

    Deep Learning for Classical Japanese Literature

    Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki Yamamoto, and David Ha. Deep learning for Classical Japanese literature. In Workshop on Machine Learning for Creativity and Design, NeurIPS 2018, 2018. URL https://arxiv.org/abs/1812.01718

  24. [24]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016

  25. [25]

    A convnet for the 2020s

    Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022

  26. [26]

    DINOv2: Learning Robust Visual Features without Supervision

    Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023.

  27. [27]

    Understanding image representations by measuring their equivariance and equivalence

    Karel Lenc and Andrea Vedaldi. Understanding image representations by measuring their equivariance and equivalence. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015. URL https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Lenc_Understanding_Image_Representations_2015_CVPR_paper.html

  28. [28]

    Robert-Jan Bruintjes, Tomasz Motyka, and Jan van Gemert. What affects learned equivariance in deep image recognition models? In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 4839–4847, 2023. doi:10.1109/CVPRW59228.2023.00512

  29. [29]

    Exploring the landscape of spatial robustness

    Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. Exploring the landscape of spatial robustness. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 1802–1811. PMLR, 09–15 Jun 2019. URL ...

  30. [30]

    Andrew Kyle Lampinen, Stephanie C. Y. Chan, Yuxuan Li, and Katherine Hermann. Representation biases: will we achieve complete understanding by analyzing representations? arXiv,

  31. [31]

    URL https://arxiv.org/abs/2507.22216

  32. [32]

    Summary statistics of learning link changing neural representations to behavior

    Jacob A. Zavatone-Veth, Blake Bordelon, and Cengiz Pehlevan. Summary statistics of learning link changing neural representations to behavior. Frontiers in Neural Circuits, 19, 2025. ISSN 1662-5110. doi:10.3389/fncir.2025.1618351. URL https://www.frontiersin.org/journals/neural-circuits/articles/10.3389/fncir.2025.1618351

  33. [33]

    Causes and consequences of representational drift

    Michael E Rule, Timothy O’Leary, and Christopher D Harvey. Causes and consequences of representational drift. Current Opinion in Neurobiology, 58:141–147, 2019. ISSN 0959-4388. doi:https://doi.org/10.1016/j.conb.2019.08.005. URL https://www.sciencedirect.com/science/article/pii/S0959438819300303

  34. [34]

    Drifting neuronal representations: Bug or feature?

    Paul Masset, Shanshan Qin, and Jacob A Zavatone-Veth. Drifting neuronal representations: Bug or feature? Biological Cybernetics, pages 1–14, 2022. doi:10.1007/s00422-021-00916-3

  35. [35]

    Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning

    Shanshan Qin, Shiva Farashahi, David Lipshutz, Anirvan M. Sengupta, Dmitri B. Chklovskii, and Cengiz Pehlevan. Coordinated drift of receptive fields in Hebbian/anti-Hebbian network models during noisy representation learning. Nature Neuroscience, 26(2):339–349, 2023. ISSN 1546-1726. doi:10.1038/s41593-022-01225-z. URL https://doi.org/10.1038/s41593-022-01225-z

  36. [36]

    Stochastic gradient descent-induced drift of representation in a two-layer neural network

    Farhad Pashakhanloo and Alexei Koulakov. Stochastic gradient descent-induced drift of representation in a two-layer neural network. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 27401–27419. PMLR, 2023. URL https://proceedings.mlr.press/v202/pashakhanloo23a.html

  37. [37]

    Contribution of task-irrelevant stimuli to drift of neural representations

    Farhad Pashakhanloo. Contribution of task-irrelevant stimuli to drift of neural representations. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=jAoqtT58G4

  38. [38]

    Representational drift in primary olfactory cortex

    Carl E Schoonover, Sarah N Ohashi, Richard Axel, and Andrew JP Fink. Representational drift in primary olfactory cortex. Nature, 594(7864):541–546, 2021. doi:10.1038/s41586-021-03628-7

  39. [39]

    Loss landscapes of regularized linear autoencoders

    Daniel Kunin, Jonathan Bloom, Aleksandrina Goeva, and Cotton Seed. Loss landscapes of regularized linear autoencoders. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3560–3569. PMLR, 09–15 Jun 2019. URL https://proc...

  40. [40]

    James C. R. Whittington, Will Dorrell, Surya Ganguli, and Timothy Behrens. Disentanglement with biological constraints: A theory of functional cell types. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=9Z_GfhZnGH.

  41. [41]

    Range, not independence, drives modularity in biologically inspired representations

    Will Dorrell, Kyle Hsu, Luke Hollingsworth, Jin Hwa Lee, Jiajun Wu, Chelsea Finn, Peter E. Latham, Timothy Edward John Behrens, and James C. R. Whittington. Range, not independence, drives modularity in biologically inspired representations. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=B...

  42. [42]

    Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

    Lukas Braun, Erin Grant, and Andrew M Saxe. Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=YucuAuXMpT

  43. [43]

    On privileged and convergent bases in neural network representations

    Davis Brown, Nikhil Vyas, and Yamini Bansal. On privileged and convergent bases in neural network representations. arXiv, 2023. URL https://arxiv.org/abs/2307.12941

  44. [44]

    Privileged representational axes in biological and artificial neural networks

    Meenakshi Khosla, Alex H Williams, Josh McDermott, and Nancy Kanwisher. Privileged representational axes in biological and artificial neural networks. bioRxiv, 2024. doi:10.1101/2024.06.20.599957. URL https://www.biorxiv.org/content/early/2024/06/20/2024.06.20.599957

  45. [45]

    Population codes enable learning from few examples by shaping inductive bias

    Blake Bordelon and Cengiz Pehlevan. Population codes enable learning from few examples by shaping inductive bias. eLife, 11:e78606, 2022. ISSN 2050-084X. doi:10.7554/eLife.78606. URL https://doi.org/10.7554/eLife.78606

  46. [46]

    How does training shape the Riemannian geometry of neural network representations?

    Jacob A Zavatone-Veth, Sheng Yang, Julian Alex Rubinfien, and Cengiz Pehlevan. How does training shape the Riemannian geometry of neural network representations? In NeurIPS 2025 Workshop on Symmetry and Geometry in Neural Representations, 2025. URL https://openreview.net/forum?id=BaVIDhh7bj

  47. [47]

    Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds

    N Alex Cayco Gajic and Arthur Pellegrino. Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds. arXiv, 2026. URL https://arxiv.org/abs/2603.28764

  48. [48]

    Kernel methods for deep learning

    Youngmin Cho and Lawrence K Saul. Kernel methods for deep learning. In Y. Bengio, D. Schuurmans, J. Lafferty, C. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems, volume 22. Curran Associates, Inc., 2009. URL https://proceedings.neurips.cc/paper/2009/file/5751ec3e9a4feab575962e78e006250d-Paper.pdf

  49. [49]

    Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances

    Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clement Hongler, Wulfram Gerstner, and Johanni Brea. Geometry of the loss landscape in overparameterized neural networks: Symmetries and invariances. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine...

  50. [50]

    Semi-flat minima and saddle points by embedding neural networks to overparameterization

    Kenji Fukumizu, Shoichiro Yamaguchi, Yoh-ichi Mototake, and Mirai Tanaka. Semi-flat minima and saddle points by embedding neural networks to overparameterization. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL h...