Learning Affine-Equivariant Proximal Operators
Pith reviewed 2026-05-10 11:00 UTC · model grok-4.3
The pith
Neural networks can compute exact proximal operators while remaining equivariant to shifts and scaling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We show how to obtain learned functions parametrized by neural networks that provably compute exact proximal operators while being equivariant to shifts and scaling, which we dub Affine-Equivariant Learned Proximal Networks (AE-LPNs). We demonstrate our results on synthetic, constructive examples, and then on real data via denoising in out-of-distribution settings. Our equivariant learned proximals enhance robustness to noise distributions and affine shifts far beyond training distributions, improving the practical utility of learned proximal operators.
What carries the argument
Affine-Equivariant Learned Proximal Networks (AE-LPNs), neural-network functions that enforce both the exact proximal-operator definition for a learned regularizer and equivariance under affine transformations.
If this is right
- The learned proximal operators can be inserted into standard proximal algorithms for inverse problems while preserving exactness.
- Denoising and other reconstruction tasks gain robustness when the test inputs differ in shift or scale from the training set.
- Equivariance can be added to learned regularizers without forcing the regularizer itself to be convex or analytically simple.
- The same construction supplies a template for imposing other structural constraints on learned proximal operators.
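The first point above, that a proximal operator can be dropped into a standard proximal algorithm, can be illustrated with a minimal proximal gradient sketch. This is not the paper's code: `proximal_gradient`, the toy data, and the use of soft-thresholding as a stand-in for a trained network are all assumptions made for illustration. Any callable that is an exact proximal operator could be passed as `prox`.

```python
import numpy as np

def soft_threshold(v, tau):
    """Exact prox of tau * ||.||_1; a stand-in for a learned proximal network."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(A, b, prox, step, n_iters=200):
    """Iterate x <- prox(x - step * A^T (A x - b)).

    `prox` must be the proximal operator of step * R for the regularizer R.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)       # gradient of 0.5 * ||A x - b||^2
        x = prox(x - step * grad)      # proximal (regularization) step
    return x

# Toy problem (illustrative data): identity forward operator, l1 regularizer.
A = np.eye(3)
b = np.array([3.0, -0.5, 1.0])
lam = 1.0
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
x_hat = proximal_gradient(A, b, lambda v: soft_threshold(v, step * lam), step)
```

With an identity forward operator the iteration reduces to a single denoising step, so `x_hat` coincides with `soft_threshold(b, lam)`. Whether the same loop converges with a learned, non-convex regularizer depends on conditions that this sketch does not check.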
Where Pith is reading between the lines
- The approach could be extended to other transformation groups, such as rotations or reflections, by modifying the equivariance constraints.
- Because the networks remain exact proximal operators, they can be used inside larger optimization loops that require the proximal property for convergence guarantees.
- Training data requirements might decrease if equivariance is enforced by architecture rather than learned from augmented examples.
Load-bearing premise
A neural-network parametrization can be found that satisfies the exact proximal-operator condition for some data-driven regularizer at the same time as the affine-equivariance constraints.
What would settle it
An input point where the network output violates either the proximal fixed-point relation for the implied regularizer or the equivariance relation under a shift or scale change.
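The equivariance half of this test can be sketched numerically. The example below is a hedged illustration, not the paper's evaluation: it assumes that a shift means adding the same constant to every entry and that scalings are positive, and the helper name `equivariance_gaps` is hypothetical. It compares two closed-form proximal operators, one that is affine-equivariant and one (soft-thresholding) that is not.

```python
import numpy as np

def centered_ridge_prox(x, lam=1.0):
    """Exact prox of R(y) = (lam / 2) * ||y - mean(y)||^2, in closed form."""
    return x - (lam / (1.0 + lam)) * (x - x.mean())

def soft_threshold(x, tau=1.0):
    """Exact prox of tau * ||.||_1; neither shift- nor scale-equivariant."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def equivariance_gaps(f, x, shift, scale):
    """Largest entrywise violation of f(x + c) = f(x) + c and f(a x) = a f(x)."""
    shift_gap = np.max(np.abs(f(x + shift) - (f(x) + shift)))
    scale_gap = np.max(np.abs(f(scale * x) - scale * f(x)))
    return shift_gap, scale_gap

x = np.array([3.0, -0.5, 1.0])          # illustrative test point
gaps_equivariant = equivariance_gaps(centered_ridge_prox, x, shift=2.0, scale=3.0)
gaps_soft = equivariance_gaps(soft_threshold, x, shift=2.0, scale=3.0)
```

A gap that is clearly above floating-point tolerance at any input, as happens for `soft_threshold` here, is the kind of counterexample described above. Checking the proximal relation itself would additionally require access to the implied regularizer, which this sketch does not model.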
Original abstract
Proximal operators are fundamental across many applications in signal processing and machine learning, including solving ill-posed inverse problems. Recent work has introduced Learned Proximal Networks (LPNs), providing parametric functions that compute exact proximals for data-driven and potentially non-convex regularizers. However, in many settings it is important to include additional structure to these regularizers, and their corresponding proximals, such as shift and scale equivariance. In this work, we show how to obtain learned functions parametrized by neural networks that provably compute exact proximal operators while being equivariant to shifts and scaling, which we dub Affine-Equivariant Learned Proximal Networks (AE-LPNs). We demonstrate our results on synthetic, constructive examples, and then on real data via denoising in out-of-distribution settings. Our equivariant learned proximals enhance robustness to noise distributions and affine shifts far beyond training distributions, improving the practical utility of learned proximal operators.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Affine-Equivariant Learned Proximal Networks (AE-LPNs), neural network-parametrized functions that provably compute exact proximal operators for data-driven (potentially non-convex) regularizers while satisfying equivariance to shifts and scalings. It provides constructive synthetic examples to illustrate the theory and applies the approach to real-data denoising, reporting improved robustness to noise distributions and affine shifts outside the training distribution.
Significance. If the central claims hold, the work extends prior Learned Proximal Network results by incorporating affine equivariance as an explicit structural constraint on both the regularizer and its proximal map. The provision of synthetic constructive examples is a clear strength, as it permits direct verification of the exactness and equivariance properties. The emphasis on out-of-distribution robustness in the denoising experiments also addresses a practical limitation of purely data-driven proximal methods.
Major comments (2)
- [Abstract and theoretical construction] The claim that AE-LPNs simultaneously achieve exact proximal-operator semantics for a data-driven regularizer R and affine equivariance must be shown not to force R to be invariant under the same group actions. The subdifferential relation x - f(x) ∈ ∂R(f(x)) implies that f(Tx) = T f(x) for affine T typically requires R(Ty) = R(y); if the neural parametrization enforces this via equivariant layers or parameter tying, it restricts admissible R even for non-convex cases, undermining the stated generality for arbitrary data-driven regularizers.
- [Synthetic examples section] The constructive examples should explicitly state whether the generated regularizers are invariant under the affine group or whether non-invariant R can still be represented exactly by an AE-LPN; if only the invariant subclass works, this directly tests the weakest assumption that both exactness and equivariance can coexist without one property limiting the other.
Minor comments (2)
- [Introduction] Notation for the affine group actions (shifts and scalings) could be introduced more explicitly when first defining equivariance, to avoid ambiguity between the proximal map and the regularizer.
- [Real-data experiments] The denoising experiments would benefit from a brief statement of the precise affine transformations applied at test time and how they differ from the training distribution.
Simulated Author's Rebuttal
We thank the referee for the insightful comments on our manuscript. We address the major comments point by point below, providing clarifications on the theoretical construction and agreeing to enhance the synthetic examples section for clarity. These revisions will strengthen the presentation of our results.
Point-by-point responses
- Referee: [Abstract and theoretical construction] The claim that AE-LPNs simultaneously achieve exact proximal-operator semantics for a data-driven regularizer R and affine equivariance must be shown not to force R to be invariant under the same group actions. The subdifferential relation x - f(x) ∈ ∂R(f(x)) implies that f(Tx) = T f(x) for affine T typically requires R(Ty) = R(y); if the neural parametrization enforces this via equivariant layers or parameter tying, it restricts admissible R even for non-convex cases, undermining the stated generality for arbitrary data-driven regularizers.
  Authors: We thank the referee for highlighting this important theoretical nuance. Indeed, the affine equivariance of the proximal operator f does imply that the corresponding regularizer R must possess compatible invariance properties under the group actions (for instance, shift-invariance for translations, and appropriate scaling behavior for scalings). This is a direct consequence of the subdifferential characterization. Our construction does not claim to represent completely arbitrary regularizers without any structural constraints; rather, AE-LPNs enable the learning of data-driven regularizers that are equipped with affine equivariance in their proximal operators. The 'data-driven' aspect allows the specific form of R (including non-convexity) to be determined from data, while the equivariant neural parametrization ensures the proximal map respects the desired symmetries. This is analogous to how equivariant networks are used in other domains to incorporate inductive biases without losing expressivity within the constrained function class. We will revise the abstract and the theoretical construction section to explicitly discuss this relationship between the equivariance of f and the properties of R, making clear that the generality is within the class of regularizers admitting affine-equivariant proximals.
  Revision: yes
- Referee: [Synthetic examples section] The constructive examples should explicitly state whether the generated regularizers are invariant under the affine group or whether non-invariant R can still be represented exactly by an AE-LPN; if only the invariant subclass works, this directly tests the weakest assumption that both exactness and equivariance can coexist without one property limiting the other.
  Authors: We agree with the referee that the synthetic examples would be improved by an explicit discussion of this point. In the constructive examples, the regularizers are generated to be invariant under the affine group actions, which ensures that their proximal operators can be exactly equivariant while satisfying the proximal operator properties. Non-invariant regularizers generally cannot have exactly affine-equivariant proximal operators, as this would violate the subdifferential relation for general transformations. This does not limit the practical utility, as the invariant class still allows for rich, data-driven, and non-convex regularizers learned via the AE-LPN parametrization. We will revise the synthetic examples section to explicitly state that the examples use invariant regularizers and to briefly explain why this is necessary for the coexistence of exactness and equivariance.
  Revision: yes
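The relationship between equivariance of the proximal map and structure in R, which both exchanges turn on, can be made concrete with a short change-of-variables calculation. This is a sketch under stated assumptions and not the paper's derivation: it assumes the unit-step definition of the proximal operator, shifts along the all-ones vector, and positive scalings. For non-convex R the arg min may be set-valued, in which case the identities are to be read as set equalities.

```latex
% Assumed definition: prox_R(x) = argmin_y (1/2)||x - y||^2 + R(y).
\begin{align*}
\operatorname{prox}_R(x + c\mathbf{1})
  &= \arg\min_y \tfrac12\|x + c\mathbf{1} - y\|^2 + R(y) \\
  &= c\mathbf{1} + \arg\min_z \tfrac12\|x - z\|^2 + R(z + c\mathbf{1})
  && (y = z + c\mathbf{1}), \\
\operatorname{prox}_R(a x)
  &= \arg\min_y \tfrac12\|a x - y\|^2 + R(y) \\
  &= a \cdot \arg\min_z \tfrac12\|x - z\|^2 + a^{-2} R(a z)
  && (y = a z,\ a > 0).
\end{align*}
% Sufficient conditions (constants independent of z):
%   R(z + c 1) = R(z) + const    =>  prox_R(x + c 1) = prox_R(x) + c 1
%   R(a z)     = a^2 R(z) + const =>  prox_R(a x)    = a prox_R(x)
```

Under these assumptions, shift equivariance follows from shift invariance of R up to an additive constant, while scale equivariance follows from a degree-2 homogeneity condition rather than from R(az) = R(z). That matches the "appropriate scaling behavior" wording in the first response. The calculation gives sufficient conditions only and says nothing about which conditions the AE-LPN parametrization actually enforces.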
Circularity Check
No circularity: derivation extends LPNs via independent architectural constraints
The paper's central claim is that neural networks can be parametrized to compute exact proximal operators for data-driven regularizers while enforcing affine equivariance. No equations, definitions, or self-citations in the abstract or visible text reduce this result to a tautology, fitted input, or self-referential premise. The extension from prior LPN work is presented as adding structural constraints (shift/scale equivariance) whose compatibility with exact prox computation is demonstrated on synthetic and real data, without the result being forced by construction or by load-bearing self-citation. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Axioms (1)
- standard math: Proximal operators are defined via the argmin of a regularized objective; for convex regularizers they satisfy standard properties such as firm nonexpansiveness.
Invented entities (1)
- Affine-Equivariant Learned Proximal Network (AE-LPN): no independent evidence