pith. sign in

arxiv: 2602.04054 · v2 · submitted 2026-02-03 · 💻 cs.LG · cs.CV

SEIS: Subspace-based Equivariance and Invariance Scores for Neural Representations

Pith reviewed 2026-05-16 07:36 UTC · model grok-4.3

classification 💻 cs.LG cs.CV
keywords equivarianceinvarianceneural representationsgeometric transformationssubspace metricconvolutional networkstransformersfeature analysis
0
0 comments X

The pith

SEIS disentangles equivariance from invariance in neural layers using subspace metrics on feature transformations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SEIS as a label-free subspace metric that measures how internal representations in neural networks respond to geometric transformations. It separates the preservation of transformation structure (equivariance) from the loss of sensitivity to it (invariance) at each layer. This separation matters because prior approaches only compared final outputs and could not tell whether geometric information was discarded or merely re-encoded. Experiments across architectures show convolutional encoders move from strong equivariance in early layers to greater invariance deeper in the network, with both properties stabilizing after a few training epochs. Segmentation decoders recover equivariance in later stages, while data augmentation and multi-task training boost both scores together.

Core claim

SEIS is a subspace metric for layer-wise feature representations under geometric transformations that disentangles equivariance from invariance without requiring labels or explicit knowledge of the transformation. Convolutional encoders exhibit a depth-wise transition from strong equivariance to increasing invariance that stabilizes within the first few training epochs. In segmentation decoders equivariance tends to recover in later layers. Data augmentation strengthens both properties simultaneously, and multi-task learning produces synergistic gains beyond single-task training. Transformer-based models display distinct geometric behaviors while MLP-Mixers fall between CNN and transformer特性

What carries the argument

The SEIS subspace metric, which projects transformed and untransformed features into subspaces to isolate and quantify their equivariant versus invariant components.

If this is right

  • Convolutional encoders shift from equivariant early layers to invariant later layers.
  • Both equivariance and invariance stabilize after only a few training epochs.
  • Data augmentation increases both equivariance and invariance at once.
  • Multi-task learning yields higher scores in both properties than either task alone.
  • Transformers show different layer-wise geometric patterns than convolutional networks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • SEIS could guide architecture choices to preserve equivariance in specific layers for tasks like segmentation.
  • The metric might be applied to non-geometric transformations such as color shifts or noise to test broader robustness.
  • Comparing SEIS scores before and after fine-tuning could reveal how transfer learning alters geometric structure in representations.

Load-bearing premise

The subspace construction and chosen basis reliably separate equivariance from invariance for arbitrary geometric transformations and network architectures.

What would settle it

Running SEIS on a group-equivariant convolutional network known to maintain perfect equivariance across all layers and observing low equivariance scores would falsify the metric's separation claim.

read the original abstract

Understanding how neural representations respond to geometric transformations is essential for evaluating whether learned features preserve meaningful spatial structure. Existing approaches primarily assess robustness primarily by comparing model outputs under transformed inputs, offering limited insight into how geometric information is organized within internal representations and failing to distinguish between information loss and re-encoding. In this work, we introduce SEIS (Subspace-based Equivariance and Invariance Scores), a subspace metric for analyzing layer-wise feature representations under geometric transformations, disentangling equivariance from invariance without requiring labels or explicit knowledge of the transformation. Through controlled experiments across diverse architectures, we uncover several consistent patterns. First, convolutional encoders exhibit a depth-wise transition from strong equivariance to increasing invariance, with both properties stabilizing within the first few training epochs. In segmentation decoders, however, equivariance tends to recover in later layers. Second, this trade-off is not intrinsic but is shaped by training decisions: data augmentation actively strengthens both equivariance and invariance simultaneously, and multi-task learning induces synergistic gains in both properties beyond what either task achieves alone. Extending our analysis beyond convolutional networks, we find that transformer-based models exhibit distinct geometric behaviors, while MLP-Mixers display intermediate characteristics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces SEIS (Subspace-based Equivariance and Invariance Scores), a metric that extracts subspaces from layer activations under geometric transformations to separately quantify equivariance and invariance in neural representations. It applies SEIS to CNN encoders/decoders, transformers, and MLP-Mixers, reporting a depth-wise shift from equivariance to invariance (stabilizing early in training), equivariance recovery in segmentation decoders, simultaneous strengthening of both properties under data augmentation, synergistic gains from multi-task learning, and architecture-specific geometric behaviors.

Significance. If the subspace separation is robust, SEIS provides a label-free, transformation-agnostic tool for dissecting internal geometric structure that output-based robustness metrics cannot access. The reported training and architecture patterns could inform design choices for spatially structured tasks, though the significance hinges on whether the scores reflect intrinsic representation geometry rather than subspace-construction artifacts.

major comments (2)
  1. [§3.2] §3.2 (SEIS definition): the equivariance and invariance scores are obtained by projecting activations onto subspaces derived from SVD of stacked transformed features; the paper does not specify a data-independent rule for selecting subspace dimension k or basis rank, so the claimed disentanglement may vary with this choice and with data distribution, directly affecting the depth-wise transition and augmentation results.
  2. [§4.1–4.3] §4.1–4.3 (experiments): the reported patterns (e.g., early stabilization within first few epochs, synergistic multi-task gains) are shown only for the chosen architectures and transformations; no ablation varies the subspace rank or basis construction method, leaving open whether the CNN-to-transformer differences and decoder recovery are stable under alternative subspace selections.
minor comments (2)
  1. [Figure 2] Figure 2 caption and §4.2 text: the y-axis scaling for invariance scores is not stated explicitly, making it hard to compare absolute values across architectures.
  2. [§2] §2 (related work): the discussion of prior equivariance metrics omits recent subspace-based approaches in representation geometry; adding 2–3 key citations would clarify novelty.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. The feedback highlights important aspects of methodological clarity and experimental robustness that we will address in the revision. Below we respond point by point to the major comments.

read point-by-point responses
  1. Referee: [§3.2] §3.2 (SEIS definition): the equivariance and invariance scores are obtained by projecting activations onto subspaces derived from SVD of stacked transformed features; the paper does not specify a data-independent rule for selecting subspace dimension k or basis rank, so the claimed disentanglement may vary with this choice and with data distribution, directly affecting the depth-wise transition and augmentation results.

    Authors: We agree that an explicit, reproducible rule for selecting the subspace dimension k is necessary. The original manuscript leaves this choice implicit. In the revised version we will define k as the minimal dimension retaining at least 95% of the cumulative explained variance from the SVD of the stacked transformed activations. This is a standard, data-driven threshold that remains consistent across datasets and architectures. We will also add a sensitivity analysis (new paragraph in §3.2 and supplementary figures) showing that the reported depth-wise transition, early stabilization, and augmentation effects remain qualitatively unchanged for thresholds between 90% and 99%. revision: yes

  2. Referee: [§4.1–4.3] §4.1–4.3 (experiments): the reported patterns (e.g., early stabilization within first few epochs, synergistic multi-task gains) are shown only for the chosen architectures and transformations; no ablation varies the subspace rank or basis construction method, leaving open whether the CNN-to-transformer differences and decoder recovery are stable under alternative subspace selections.

    Authors: We acknowledge that the current experiments do not include ablations on subspace rank or basis construction. In the revision we will add a dedicated ablation subsection to §4 that (i) varies k over a wide range for the CNN encoder, transformer, and decoder settings and (ii) compares the SVD-based stacked-feature basis against PCA on individual transformations and random projections. Results will be reported for the key patterns (early stabilization, multi-task synergy, decoder recovery, and architecture differences) to demonstrate stability. These additions will appear both in the main text and supplementary tables. revision: yes

Circularity Check

0 steps flagged

SEIS introduced as direct subspace definition with no reduction to inputs or self-citations

full rationale

The paper defines SEIS explicitly as a subspace metric computed from layer activations under geometric transformations, with the central claim being its ability to disentangle equivariance from invariance by construction of the subspaces. No equations or steps in the provided description reduce a claimed prediction or result back to fitted parameters, self-citations, or ansatzes that presuppose the outcome. The method is presented as an analysis tool rather than a derived theorem, and the abstract emphasizes independence from labels or explicit transformation knowledge without invoking prior self-referential results. This makes the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no concrete free parameters, axioms, or invented entities; full text would be required to populate the ledger.

pith-pipeline@v0.9.0 · 5514 in / 1056 out tokens · 35507 ms · 2026-05-16T07:36:20.171819+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages · 1 internal anchor

  1. [1]

    INTRODUCTION The remarkable success of deep learning across computer vi- sion, physics simulation, and molecular biology is largely attributed to the effective utilization of inductive biases [1]. Convolutional Neural Networks (CNNs) integrate translation equivariance to handle spatial data efficiently [2], while re- cent advances in Group equivariant CNN...

  2. [2]

    RELA TED WORK Quantifying the response of neural networks to geometric transformations is crucial for evaluating whether learned representations behave in a structured manner. Although specialized architectures such as G-CNNs [3] theoretically enforce equivariance, empirical measurement remains es- sential, as these guarantees often degrade in practice du...

  3. [3]

    Furthermore, modern architectures process these features via linear convolutions and element-wise non- linearities that preserve local topology

    METHOD Rationale.While the theoretical definition of equivariance permits arbitrary non-linear transformations, spatial trans- formations on grid-structured data act as linear operators on the feature space. Furthermore, modern architectures process these features via linear convolutions and element-wise non- linearities that preserve local topology. Cons...

  4. [4]

    Validation on Synthetic Transformations Experimental Setup.We validate SEIS using a controlled synthetic setup that isolates geometric effects from training dynamics

    EXPERIMENTS 4.1. Validation on Synthetic Transformations Experimental Setup.We validate SEIS using a controlled synthetic setup that isolates geometric effects from training dynamics. Activations are extracted from a single convolu- tional layer on MNIST, and paired representations are con- structed by directly applying spatial transformations to the refe...

  5. [5]

    pared to≈0.3in the non-augmented model

    Augmentation increases invariance while preserving equivariance in deeper layers. pared to≈0.3in the non-augmented model. These results in- dicate that affine augmentation promotes representations that are more transformation-tolerant while preserving equivari- ant structure, consistent with empirical observations based on Pearson correlation measures [8]...

  6. [6]

    CONCLUSIONS We presented SEIS, a subspace-based metric for dissecting the geometric behavior of neural representations by disentan- gling equivariance from invariance. Our experiments show that the equivariance–invariance trade-off is not an inherent property of network architecture but is actively shaped by training choices, including augmentation, task ...

  7. [7]

    Geometric deep learning: going beyond euclidean data,

    Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst, “Geometric deep learning: going beyond euclidean data,”IEEE Signal Processing Magazine, vol. 34, no. 4, pp. 18–42, 2017

  8. [8]

    Backpropagation applied to hand- written zip code recognition,

    Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne Hubbard, and Lawrence D Jackel, “Backpropagation applied to hand- written zip code recognition,”Neural Computation, vol. 1, no. 4, pp. 541–551, 1989

  9. [9]

    Group equivariant con- volutional networks,

    Taco Cohen and Max Welling, “Group equivariant con- volutional networks,” inInternational conference on machine learning. PMLR, 2016, pp. 2990–2999

  10. [10]

    General E(2)- equivariant steerable CNNs,

    Maurice Weiler and Gabriele Cesa, “General E(2)- equivariant steerable CNNs,”Advances in Neural In- formation Processing Systems, vol. 32, 2019

  11. [11]

    Why do deep convo- lutional networks generalize so poorly to small image transformations?,

    Aharon Azulay and Yair Weiss, “Why do deep convo- lutional networks generalize so poorly to small image transformations?,”Journal of Machine Learning Re- search, vol. 20, no. 184, pp. 1–25, 2019

  12. [12]

    Measuring invariances in deep net- works,

    Ian Goodfellow, Honglak Lee, Quoc Le, Andrew Saxe, and Andrew Ng, “Measuring invariances in deep net- works,”Advances in Neural Information Processing Systems, vol. 22, 2009

  13. [13]

    In what ways are deep neural networks invariant and how should we measure this?,

    Henry Kvinge, Tegan Emerson, Grayson Jorgenson, Scott Vasquez, Tim Doster, and Jesse Lew, “In what ways are deep neural networks invariant and how should we measure this?,” inAdvances in Neural Information Processing Systems, 2022, vol. 35, pp. 32816–32829

  14. [14]

    What affects learned equivariance in deep image recognition models?,

    Robert-Jan Bruintjes, Tomasz Motyka, and Jan van Gemert, “What affects learned equivariance in deep image recognition models?,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 4839–4847

  15. [15]

    In- variance measures for neural networks,

    Facundo Manuel Quiroga, Jordina Torrents-Barrena, Laura Cristina Lanzarini, and Domenec Puig-Valls, “In- variance measures for neural networks,”Applied Soft Computing, vol. 132, pp. 109817, 2023

  16. [16]

    Svcca: Singular vector canoni- cal correlation analysis for deep learning dynamics and interpretability,

    Maithra Raghu, Justin Gilmer, Jason Yosinski, and Jascha Sohl-Dickstein, “Svcca: Singular vector canoni- cal correlation analysis for deep learning dynamics and interpretability,”Advances in Neural Information Pro- cessing Systems, vol. 30, 2017

  17. [17]

    Similarity of neural network rep- resentations revisited,

    Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton, “Similarity of neural network rep- resentations revisited,” inInternational Conference on Machine Learning. PMlR, 2019, pp. 3519–3529

  18. [18]

    Learning invariances in neural net- works from training data,

    Gregory Benton, Marc Finzi, Pavel Izmailov, and An- drew G Wilson, “Learning invariances in neural net- works from training data,”Advances in Neural Infor- mation Processing Systems, vol. 33, pp. 17605–17616, 2020

  19. [19]

    Benchmarking neural network robustness to common corruptions and perturbations,

    Dan Hendrycks and Thomas Dietterich, “Benchmarking neural network robustness to common corruptions and perturbations,”International Conference on Learning Representations, 2019

  20. [20]

    On the strong correlation between model invariance and generalization,

    Weijian Deng, Stephen Gould, and Liang Zheng, “On the strong correlation between model invariance and generalization,”Advances in Neural Information Pro- cessing Systems, vol. 35, pp. 28052–28067, 2022

  21. [21]

    Shortcut learning in deep neural networks,

    Robert Geirhos, J ¨orn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A Wichmann, “Shortcut learning in deep neural networks,”Nature Machine Intelligence, vol. 2, no. 11, pp. 665–673, 2020

  22. [22]

    Understanding im- age representations by measuring their equivariance and equivalence,

    Karel Lenc and Andrea Vedaldi, “Understanding im- age representations by measuring their equivariance and equivalence,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 991–999

  23. [23]

    Spatial transformer networks,

    Max Jaderberg, Karen Simonyan, Andrew Zisserman, et al., “Spatial transformer networks,”Advances in Neu- ral Information Processing Systems, vol. 28, 2015

  24. [24]

    Dynamic routing between capsules,

    Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton, “Dynamic routing between capsules,”Advances in Neu- ral Information Processing Systems, vol. 30, 2017

  25. [25]

    The Lie derivative for measur- ing learned equivariance,

    Nate Gruver, Marc Finzi, Micah Goldblum, and An- drew Gordon Wilson, “The Lie derivative for measur- ing learned equivariance,”International Conference on Learning Representations, 2022

  26. [26]

    Grounding representation similarity through sta- tistical testing,

    Frances Ding, Jean-Stanislas Denain, and Jacob Stein- hardt, “Grounding representation similarity through sta- tistical testing,”Advances in Neural Information Pro- cessing Systems, vol. 34, pp. 1556–1568, 2021

  27. [27]

    U-net: Convolutional networks for biomedical image segmentation,

    Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-net: Convolutional networks for biomedical image segmentation,” inInternational Conference on Medical Image Computing and Computer-assisted Intervention. Springer, 2015, pp. 234–241

  28. [28]

    Multi-task learning for dense prediction tasks: A survey,

    Simon Vandenhende, Stamatios Georgoulis, Wouter Van Gansbeke, Marc Proesmans, Dengxin Dai, and Luc Van Gool, “Multi-task learning for dense prediction tasks: A survey,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 7, pp. 3614–3633, 2021