SEIS: Subspace-based Equivariance and Invariance Scores for Neural Representations
Pith reviewed 2026-05-16 07:36 UTC · model grok-4.3
The pith
SEIS disentangles equivariance from invariance in neural layers using subspace metrics on feature transformations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SEIS is a subspace metric for layer-wise feature representations under geometric transformations that disentangles equivariance from invariance without requiring labels or explicit knowledge of the transformation. Convolutional encoders exhibit a depth-wise transition from strong equivariance to increasing invariance that stabilizes within the first few training epochs. In segmentation decoders equivariance tends to recover in later layers. Data augmentation strengthens both properties simultaneously, and multi-task learning produces synergistic gains beyond single-task training. Transformer-based models display distinct geometric behaviors, while MLP-Mixers fall between CNNs and transformers in their characteristics.
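For reference, the two properties named in the core claim have standard textbook definitions. The notation here is assumed for illustration and is not taken from the paper: f maps an input to a layer's features, T_g is a transformation acting on inputs, and T'_g is a corresponding action on features.

```latex
\begin{align}
\text{equivariance:}\quad & f(T_g x) = T'_g \, f(x) \\
\text{invariance:}\quad   & f(T_g x) = f(x)
\end{align}
```

Invariance is the special case in which T'_g is the identity. A comparison of f(T_g x) with f(x) alone therefore cannot separate features that were re-encoded (nontrivial T'_g) from features that were lost.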
What carries the argument
The SEIS subspace metric, which projects transformed and untransformed features into subspaces to isolate and quantify their equivariant versus invariant components.
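The page does not reproduce the SEIS formulas, so the following is not the authors' implementation. It is a minimal NumPy sketch of the general subspace-comparison idea, in the style of SVCCA: truncate each activation matrix to its top-k singular directions, then read canonical correlations off as cosines of the principal angles between the two subspaces. All function names, the toy data, and the use of a cyclic shift of feature coordinates as the "transformation" are assumptions made for illustration.

```python
import numpy as np


def top_subspace(acts, k):
    """Orthonormal basis (n_samples x k) of the top-k singular directions of centered activations."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k]


def canonical_correlations(acts_a, acts_b, k):
    """Canonical correlations between the top-k subspaces of two activation matrices."""
    qa = top_subspace(acts_a, k)
    qb = top_subspace(acts_b, k)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return np.clip(cosines, 0.0, 1.0)


def mean_sample_cosine(acts_a, acts_b):
    """Average per-sample cosine similarity; high only if features barely change."""
    dots = np.sum(acts_a * acts_b, axis=1)
    norms = np.linalg.norm(acts_a, axis=1) * np.linalg.norm(acts_b, axis=1)
    return float(np.mean(dots / norms))


rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 16))  # toy reference features (samples x dims)
# Hypothetical transform: feature coordinates move, no information is lost.
shifted = np.roll(feats, 1, axis=1)
# Hypothetical transform: features are almost unchanged.
jittered = feats + 0.01 * rng.standard_normal(feats.shape)

rho_shifted = canonical_correlations(feats, shifted, k=16)
cos_shifted = mean_sample_cosine(feats, shifted)
cos_jittered = mean_sample_cosine(feats, jittered)
print(rho_shifted.min(), cos_shifted, cos_jittered)
```

In this toy case the shifted features remain linearly recoverable from the originals (canonical correlations close to 1) while their sample-wise similarity is close to zero. A measure that only checks whether features stay the same would read that re-encoding as loss, which is the distinction a subspace score is meant to capture.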
If this is right
- Convolutional encoders shift from equivariant early layers to invariant later layers.
- Both equivariance and invariance stabilize after only a few training epochs.
- Data augmentation increases both equivariance and invariance at once.
- Multi-task learning yields higher scores in both properties than either task alone.
- Transformers show different layer-wise geometric patterns than convolutional networks.
Where Pith is reading between the lines
- SEIS could guide architecture choices to preserve equivariance in specific layers for tasks like segmentation.
- The metric might be applied to non-geometric transformations such as color shifts or noise to test broader robustness.
- Comparing SEIS scores before and after fine-tuning could reveal how transfer learning alters geometric structure in representations.
Load-bearing premise
The subspace construction and chosen basis reliably separate equivariance from invariance for arbitrary geometric transformations and network architectures.
What would settle it
Running SEIS on a group-equivariant convolutional network known to maintain perfect equivariance across all layers and observing low equivariance scores would falsify the metric's separation claim.
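The test above needs a map whose equivariance is known exactly, so that a low score can be blamed on the metric rather than the network. As a minimal stand-in (a toy 1D circular convolution under cyclic shifts, not a G-CNN and not SEIS), the sketch below computes transformation-aware ground-truth errors that a label-free score could be checked against. The function names and toy data are assumptions for illustration.

```python
import numpy as np


def circular_conv(x, kernel):
    """Circular cross-correlation of each row of x with a 1D kernel."""
    out = np.zeros_like(x)
    for offset, weight in enumerate(kernel):
        out += weight * np.roll(x, -offset, axis=-1)
    return out


def equivariance_error(f, x, shift):
    """Max deviation between f(shift(x)) and shift(f(x)); zero for an equivariant map."""
    lhs = f(np.roll(x, shift, axis=-1))
    rhs = np.roll(f(x), shift, axis=-1)
    return float(np.max(np.abs(lhs - rhs)))


def invariance_error(f, x, shift):
    """Max deviation between f(shift(x)) and f(x); zero for an invariant map."""
    return float(np.max(np.abs(f(np.roll(x, shift, axis=-1)) - f(x))))


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 32))
kernel = np.array([0.5, -1.0, 0.25])


def conv(v):
    # Equivariant to cyclic shifts by construction.
    return circular_conv(v, kernel)


def pooled(v):
    # Global average pooling after the convolution makes the output shift-invariant.
    return circular_conv(v, kernel).mean(axis=-1)


conv_equiv_err = equivariance_error(conv, x, shift=3)
conv_inv_err = invariance_error(conv, x, shift=3)
pooled_inv_err = invariance_error(pooled, x, shift=3)
print(conv_equiv_err, conv_inv_err, pooled_inv_err)
```

Here the convolution is equivariant but not invariant, and the pooled output is invariant. A score that claims to separate the two properties should rank these two maps accordingly.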
Original abstract
Understanding how neural representations respond to geometric transformations is essential for evaluating whether learned features preserve meaningful spatial structure. Existing approaches assess robustness primarily by comparing model outputs under transformed inputs, offering limited insight into how geometric information is organized within internal representations and failing to distinguish between information loss and re-encoding. In this work, we introduce SEIS (Subspace-based Equivariance and Invariance Scores), a subspace metric for analyzing layer-wise feature representations under geometric transformations, disentangling equivariance from invariance without requiring labels or explicit knowledge of the transformation. Through controlled experiments across diverse architectures, we uncover several consistent patterns. First, convolutional encoders exhibit a depth-wise transition from strong equivariance to increasing invariance, with both properties stabilizing within the first few training epochs. In segmentation decoders, however, equivariance tends to recover in later layers. Second, this trade-off is not intrinsic but is shaped by training decisions: data augmentation actively strengthens both equivariance and invariance simultaneously, and multi-task learning induces synergistic gains in both properties beyond what either task achieves alone. Extending our analysis beyond convolutional networks, we find that transformer-based models exhibit distinct geometric behaviors, while MLP-Mixers display intermediate characteristics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SEIS (Subspace-based Equivariance and Invariance Scores), a metric that extracts subspaces from layer activations under geometric transformations to separately quantify equivariance and invariance in neural representations. It applies SEIS to CNN encoders/decoders, transformers, and MLP-Mixers, reporting a depth-wise shift from equivariance to invariance (stabilizing early in training), equivariance recovery in segmentation decoders, simultaneous strengthening of both properties under data augmentation, synergistic gains from multi-task learning, and architecture-specific geometric behaviors.
Significance. If the subspace separation is robust, SEIS provides a label-free, transformation-agnostic tool for dissecting internal geometric structure that output-based robustness metrics cannot access. The reported training and architecture patterns could inform design choices for spatially structured tasks, though the significance hinges on whether the scores reflect intrinsic representation geometry rather than subspace-construction artifacts.
major comments (2)
- §3.2 (SEIS definition): the equivariance and invariance scores are obtained by projecting activations onto subspaces derived from SVD of stacked transformed features; the paper does not specify a data-independent rule for selecting subspace dimension k or basis rank, so the claimed disentanglement may vary with this choice and with data distribution, directly affecting the depth-wise transition and augmentation results.
- §4.1–4.3 (experiments): the reported patterns (e.g., early stabilization within first few epochs, synergistic multi-task gains) are shown only for the chosen architectures and transformations; no ablation varies the subspace rank or basis construction method, leaving open whether the CNN-to-transformer differences and decoder recovery are stable under alternative subspace selections.
minor comments (2)
- Figure 2 caption and §4.2 text: the y-axis scaling for invariance scores is not stated explicitly, making it hard to compare absolute values across architectures.
- §2 (related work): the discussion of prior equivariance metrics omits recent subspace-based approaches in representation geometry; adding 2–3 key citations would clarify novelty.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. The feedback highlights important aspects of methodological clarity and experimental robustness that we will address in the revision. Below we respond point by point to the major comments.
- Referee: §3.2 (SEIS definition): the equivariance and invariance scores are obtained by projecting activations onto subspaces derived from SVD of stacked transformed features; the paper does not specify a data-independent rule for selecting subspace dimension k or basis rank, so the claimed disentanglement may vary with this choice and with data distribution, directly affecting the depth-wise transition and augmentation results.
  Authors: We agree that an explicit, reproducible rule for selecting the subspace dimension k is necessary. The original manuscript leaves this choice implicit. In the revised version we will define k as the minimal dimension retaining at least 95% of the cumulative explained variance from the SVD of the stacked transformed activations. This is a standard, data-driven threshold that remains consistent across datasets and architectures. We will also add a sensitivity analysis (new paragraph in §3.2 and supplementary figures) showing that the reported depth-wise transition, early stabilization, and augmentation effects remain qualitatively unchanged for thresholds between 90% and 99%.
  Revision: yes
- Referee: §4.1–4.3 (experiments): the reported patterns (e.g., early stabilization within first few epochs, synergistic multi-task gains) are shown only for the chosen architectures and transformations; no ablation varies the subspace rank or basis construction method, leaving open whether the CNN-to-transformer differences and decoder recovery are stable under alternative subspace selections.
  Authors: We acknowledge that the current experiments do not include ablations on subspace rank or basis construction. In the revision we will add a dedicated ablation subsection to §4 that (i) varies k over a wide range for the CNN encoder, transformer, and decoder settings and (ii) compares the SVD-based stacked-feature basis against PCA on individual transformations and random projections. Results will be reported for the key patterns (early stabilization, multi-task synergy, decoder recovery, and architecture differences) to demonstrate stability. These additions will appear both in the main text and supplementary tables.
  Revision: yes
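The two responses describe a rank rule (smallest k reaching a cumulative explained-variance threshold) and an ablation over thresholds and basis constructions. A minimal sketch of how such a check could be wired up is given below. It is not the authors' code: the function names, the toy data, the reading of "stacked" as concatenation along the sample axis, the use of independent random projections per representation, and the mean canonical correlation as a stand-in score are all assumptions for illustration.

```python
import numpy as np


def explained_variance_rank(acts, threshold=0.95):
    """Smallest k whose top-k singular values reach the cumulative explained-variance threshold."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s))


def svd_basis(acts, k):
    """Top-k left singular vectors of the centered activations."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k]


def random_projection_basis(acts, k, rng):
    """Orthonormal basis of a random k-dimensional projection of the centered activations."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    projected = centered @ rng.standard_normal((acts.shape[1], k))
    q, _ = np.linalg.qr(projected)
    return q


def mean_canonical_correlation(basis_a, basis_b):
    """Mean cosine of the principal angles between two orthonormal bases."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(np.clip(cosines, 0.0, 1.0)))


def sensitivity_sweep(acts_ref, acts_tr, thresholds=(0.90, 0.95, 0.99), seed=0):
    """Report k and a stand-in score per threshold, for two basis constructions."""
    rng = np.random.default_rng(seed)
    stacked = np.vstack([acts_ref, acts_tr])  # assumption: stack along the sample axis
    rows = []
    for threshold in thresholds:
        k = explained_variance_rank(stacked, threshold)
        rows.append({
            "threshold": threshold,
            "k": k,
            "svd": mean_canonical_correlation(
                svd_basis(acts_ref, k), svd_basis(acts_tr, k)),
            "random_projection": mean_canonical_correlation(
                random_projection_basis(acts_ref, k, rng),
                random_projection_basis(acts_tr, k, rng)),
        })
    return rows


# Toy data: six latent directions with decaying scale, embedded in 32 dimensions.
rng = np.random.default_rng(1)
signal = np.zeros((400, 32))
signal[:, :6] = rng.standard_normal((400, 6)) * np.array([8.0, 4.0, 2.0, 1.0, 0.5, 0.25])
acts_ref = signal + 0.05 * rng.standard_normal(signal.shape)
# Hypothetical transform: cyclic shift of feature coordinates plus fresh noise.
acts_tr = np.roll(signal, 1, axis=1) + 0.05 * rng.standard_normal(signal.shape)

for row in sensitivity_sweep(acts_ref, acts_tr):
    print(row)
```

A reported pattern that holds across every row of such a table is less likely to be an artifact of the chosen rank or basis, which is the concern the referee raises.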
Circularity Check
SEIS introduced as direct subspace definition with no reduction to inputs or self-citations
full rationale
The paper defines SEIS explicitly as a subspace metric computed from layer activations under geometric transformations, with the central claim being its ability to disentangle equivariance from invariance by construction of the subspaces. No equations or steps in the provided description reduce a claimed prediction or result back to fitted parameters, self-citations, or ansatzes that presuppose the outcome. The method is presented as an analysis tool rather than a derived theorem, and the abstract emphasizes independence from labels or explicit transformation knowledge without invoking prior self-referential results. This makes the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We apply CCA to the denoised subspaces... S_equiv = mean absolute cosine similarity between canonical variates; S_inv weighted by canonical correlations"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "SEIS... disentangling equivariance from invariance without... explicit knowledge of the transformation"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.