Achieving Approximate Symmetry Is Exponentially Easier than Exact Symmetry
Pith reviewed 2026-05-17 01:11 UTC · model grok-4.3
The pith
Approximate symmetry can be enforced with only logarithmic averaging complexity while exact symmetry requires linear complexity in the group size.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that averaging complexity exhibits an exponential separation: exact symmetry requires linear complexity in the size of the symmetry group, whereas approximate symmetry can be achieved with only logarithmic complexity under standard conditions.
What carries the argument
Averaging complexity, a framework that quantifies the cost of enforcing symmetry through averaging over group actions.
If this is right
- Models incorporating approximate symmetries can scale to larger symmetry groups without prohibitive cost.
- Approximate symmetry provides a practical trade-off that retains most inductive bias benefits at reduced overhead.
- The separation supplies a formal reason to prefer approximate symmetries in robustness-critical applications.
- Averaging complexity becomes a tool for choosing the degree of symmetry based on available compute.
Where Pith is reading between the lines
- Designers could tune symmetry precision dynamically to match hardware budgets in large models.
- The same logarithmic-versus-linear gap might appear when symmetry is measured by other natural complexity notions.
- Scientific domains with high-dimensional symmetries could adopt graded approximations to reach previously intractable scales.
Load-bearing premise
The exponential separation holds under the paper's unspecified standard conditions and that averaging complexity is the right measure of enforcement cost.
What would settle it
A concrete construction in which approximate symmetry still requires linear averaging complexity or in which exact symmetry can be achieved with logarithmic complexity under the same framework.
Figures
read the original abstract
Enforcing exact symmetry in machine learning models often yields significant gains in scientific applications, serving as a powerful inductive bias. However, recent work suggests that relying on approximate symmetry can offer greater flexibility and robustness. Despite promising empirical evidence, there has been little theoretical understanding, and in particular, a direct comparison between exact and approximate symmetry is missing from the literature. In this paper, we initiate this study by asking: What is the cost of enforcing exact versus approximate symmetry? To address this question, we introduce averaging complexity, a framework for quantifying the cost of enforcing symmetry via averaging. Our main result is an exponential separation: under standard conditions, exact symmetry requires linear averaging complexity, whereas approximate symmetry can be attained with only logarithmic complexity in the group size. To the best of our knowledge, this provides the first theoretical separation of these two cases, formally justifying why approximate symmetry may be preferable in practice. Beyond this, our tools and techniques may be of independent interest for the broader study of symmetries in machine learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces averaging complexity as a new framework to measure the cost of enforcing symmetry in machine learning models through averaging operations. Its central claim is an exponential separation: under standard conditions, exact symmetry requires linear averaging complexity in the group size, while approximate symmetry can be achieved with only logarithmic complexity. This is positioned as the first theoretical result directly comparing the two and justifying the practical preference for approximate symmetry.
Significance. If the separation holds under conditions that cover common ML symmetry groups (e.g., permutations, rotations) and if averaging complexity proves to be a faithful proxy for practical enforcement cost, the result would supply useful theoretical support for recent empirical trends favoring approximate symmetries. The introduction of a new complexity measure and associated proof techniques could also be reusable for other symmetry-related questions in the field.
major comments (3)
- [Abstract and §1] Abstract and §1: The exponential separation is stated to hold 'under standard conditions,' yet these conditions are not explicitly enumerated. The manuscript must define them precisely (group representation, action type, model class, or averaging operator) so that it is clear whether they encompass typical ML settings such as equivariant networks on point clouds or images.
- [Definition of averaging complexity (likely §2 or §3)] Definition of averaging complexity (likely §2 or §3): The paper must justify why averaging complexity is the appropriate cost metric rather than alternatives such as parameter count, gradient computation cost, or direct invariance error. Without this justification or a comparison showing that other enforcement methods cannot bypass the linear cost for exact symmetry, the claimed separation may not transfer to practice.
- [Main theorem (likely §4)] Main theorem (likely §4): The proof of the linear lower bound for exact symmetry and the logarithmic upper bound for approximate symmetry must be checked against the precise statement of the standard conditions; if those conditions turn out to be restrictive (e.g., only abelian groups or specific averaging operators), the result's scope is narrower than the abstract suggests.
minor comments (2)
- [Related work] Clarify the relationship between averaging complexity and existing notions such as group-equivariant network depth or orbit-stabilizer costs.
- [Main result] Add a short table or example illustrating the complexity numbers for a concrete small group (e.g., cyclic group of order 8) to make the exponential gap intuitive.
Simulated Author's Rebuttal
We thank the referee for their insightful comments on our paper. We provide point-by-point responses to the major comments below. We agree that clarifications are needed and will revise the manuscript to address these points.
read point-by-point responses
-
Referee: [Abstract and §1] Abstract and §1: The exponential separation is stated to hold 'under standard conditions,' yet these conditions are not explicitly enumerated. The manuscript must define them precisely (group representation, action type, model class, or averaging operator) so that it is clear whether they encompass typical ML settings such as equivariant networks on point clouds or images.
Authors: We agree with this observation. In the revised manuscript, we will explicitly define the standard conditions in a new paragraph in Section 1 and detail them in Section 2. These conditions include: finite groups with unitary representations, transitive actions on the input space, and averaging operators that compute the group average of the model output. This setup directly applies to common ML scenarios, including permutation-equivariant networks for point clouds and rotation-equivariant networks for images. revision: yes
-
Referee: [Definition of averaging complexity (likely §2 or §3)] Definition of averaging complexity (likely §2 or §3): The paper must justify why averaging complexity is the appropriate cost metric rather than alternatives such as parameter count, gradient computation cost, or direct invariance error. Without this justification or a comparison showing that other enforcement methods cannot bypass the linear cost for exact symmetry, the claimed separation may not transfer to practice.
Authors: We will strengthen the justification for averaging complexity in the revision. Averaging complexity measures the number of group elements that need to be averaged to enforce symmetry, which is a direct proxy for the computational overhead in averaging-based symmetry enforcement. We will add a discussion comparing it to parameter count (which does not account for inference-time costs) and invariance error (which is the objective rather than the cost). However, we note that our result is specific to averaging methods; we will clarify that other methods like architectural constraints may have different costs, but the separation highlights the advantage within the averaging paradigm. revision: partial
-
Referee: [Main theorem (likely §4)] Main theorem (likely §4): The proof of the linear lower bound for exact symmetry and the logarithmic upper bound for approximate symmetry must be checked against the precise statement of the standard conditions; if those conditions turn out to be restrictive (e.g., only abelian groups or specific averaging operators), the result's scope is narrower than the abstract suggests.
Authors: The standard conditions in our theorem apply to general finite groups, including non-abelian ones such as the permutation group and rotation groups. The lower bound proof uses a counting argument on the number of distinct orbits that holds for any group action satisfying the conditions, and the upper bound uses random sampling from the group, which works for any finite group. We will add a corollary or remark verifying that the result applies to standard ML groups like S_n for permutations and SO(3) for 3D rotations, confirming the scope is as broad as stated in the abstract. revision: yes
Circularity Check
No circularity: new framework yields independent separation result
full rationale
The paper defines averaging complexity as a new measure for the cost of symmetry enforcement via averaging, then proves an exponential separation (linear for exact symmetry, logarithmic for approximate) under stated conditions. This chain does not reduce any claim to a self-definitional loop, a fitted parameter renamed as prediction, or a load-bearing self-citation. The framework is introduced explicitly before the result, and the separation follows from the definitions and proof steps rather than being equivalent to the inputs by construction. The analysis is therefore self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Standard assumptions from group theory and representation theory for symmetry groups acting on ML models
invented entities (1)
-
averaging complexity
no independent evidence
Forward citations
Cited by 1 Pith paper
-
Approximate Label Symmetries Improve Data Scaling
Approximate label symmetries improve scaling laws for ML models of hydrogen orbital densities, water vibrational modes, and 3D potential energy surfaces, with a Hessian correction for approximate cases.
Reference graph
Works this paper leans on
-
[1]
1, 3 Dominik S Kufel, Jack Kemp, DinhDuy Vu, Simon M Linsel, Chris R Laumann, and Norman Y Yao. Approximately symmetric neural networks for quantum spin liquids.Physical Review Letters, 135 (5):056702, 2025. 1, 3 Yi-Lun Liao and Tess Smidt. Equiformer: Equivariant graph attention transformer for 3d atomistic graphs. InInt. Conference on Learning Represent...
work page 2025
-
[2]
On the Burnside-Brauer-Steinberg theorem
3 Omri Puny, Matan Atzmon, Edward J Smith, Ishan Misra, Aditya Grover, Heli Ben-Hamu, and Yaron Lipman. Frame averaging for invariant and equivariant network design. InInt. Conference on Learning Representations (ICLR), 2022. 1, 3 David W Romero and Suhas Lohit. Learning partial equivariances from data. InAdvances in Neural Information Processing Systems ...
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[3]
24 Robert Steinberg. Complete sets of representations of algebras.Proceedings of the American Math- ematical Society, 13(5):746–747, 1962. 29 Behrooz Tahmasebi and Stefanie Jegelka. The exact sample complexity gain from invariances for kernel regression. InAdvances in Neural Information Processing Systems (NeurIPS), 2023. 3 Behrooz Tahmasebi and Stefanie ...
work page 1962
-
[4]
1 Joel A Tropp. User-friendly tail bounds for sums of random matrices.Foundations of computational mathematics, 12(4):389–434, 2012. 28 Tycho van der Ouderaa, David W Romero, and Mark van der Wilk. Relaxing equivariance con- straints with non-stationary continuous filters. InAdvances in Neural Information Processing Systems (NeurIPS), 2022. 1, 3 Tycho van...
-
[5]
1, 3 13 Preprint A BACKGROUND FORPROOFS This appendix collects the background used in our proofs. We briefly review finite groups, group actions, group representations, character theory, and Fourier analysis on finite groups (Serre et al., 1977; Isaacs, 1994; Fulton & Harris, 2013). A.1 GROUPTHEORY Afinite groupis a finite setGequipped with a binary opera...
work page 1977
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.