Learning Affine-Equivariant Proximal Operators

Jeremias Sulam; Oriel Savir; Zhenghan Fang

arxiv: 2604.15556 · v1 · submitted 2026-04-16 · 💻 cs.LG · cs.CV

Oriel Savir, Zhenghan Fang, Jeremias Sulam

Pith reviewed 2026-05-10 11:00 UTC · model grok-4.3

classification 💻 cs.LG cs.CV

keywords proximal operators · equivariant neural networks · learned regularizers · denoising · inverse problems · affine equivariance · machine learning

The pith

Neural networks can compute exact proximal operators while remaining equivariant to shifts and scaling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how neural networks can be parametrized to compute the proximal operator of a data-driven regularizer exactly, while also obeying equivariance under shifts and scalings. This structure matters because proximal operators appear in many inverse problems, and built-in equivariance can improve behavior when inputs vary in position or magnitude from the training examples. The authors prove that their construction satisfies both the proximal definition and the equivariance property, then test it on synthetic cases and on denoising tasks where the noise or shifts lie outside the training distribution.

Core claim

We show how to obtain learned functions parametrized by neural networks that provably compute exact proximal operators while being equivariant to shifts and scaling, which we dub Affine-Equivariant Learned Proximal Networks (AE-LPNs). We demonstrate our results on synthetic, constructive examples, and then on real data via denoising in out-of-distribution settings. Our equivariant learned proximals enhance robustness to noise distributions and affine shifts far beyond training distributions, improving the practical utility of learned proximal operators.
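
To make the two properties in this claim concrete, here is a minimal sketch on a hand-built example rather than a learned network. The regularizer R(y) = (lam/2)·||y − mean(y)||², the function names, and the test values are illustrative assumptions and do not come from the paper. The proximal operator of this R has a closed form, so both the proximal optimality condition and affine equivariance can be checked numerically.

```python
import numpy as np

def prox_variance_penalty(x, lam):
    """Closed-form minimizer of 0.5*||x - y||^2 + (lam/2)*||y - mean(y)||^2.

    Hypothetical toy example: the solution shrinks x toward its own mean.
    """
    mu = x.mean()
    return mu + (x - mu) / (1.0 + lam)

rng = np.random.default_rng(0)
x = rng.normal(size=8)
lam, a, b = 0.7, 3.0, -2.0  # arbitrary illustrative values

fx = prox_variance_penalty(x, lam)

# Proximal optimality condition: x - f(x) equals the gradient of R at f(x).
grad_R = lam * (fx - fx.mean())
prox_residual = np.max(np.abs((x - fx) - grad_R))

# Affine equivariance: f(a*x + b) should equal a*f(x) + b.
shifted_scaled = prox_variance_penalty(a * x + b, lam)
equiv_residual = np.max(np.abs(shifted_scaled - (a * fx + b)))

print(prox_residual, equiv_residual)
```

Both residuals should sit at floating-point round-off. This toy regularizer is convex and fixed by hand; the paper's contribution concerns learned, potentially non-convex regularizers, which this sketch does not attempt to reproduce.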

What carries the argument

Affine-Equivariant Learned Proximal Networks (AE-LPNs), neural-network functions that enforce both the exact proximal-operator definition for a learned regularizer and equivariance under affine transformations.

If this is right

The learned proximal operators can be inserted into standard proximal algorithms for inverse problems while preserving exactness.
Denoising and other reconstruction tasks gain robustness when the test inputs differ in shift or scale from the training set.
Equivariance can be added to learned regularizers without forcing the regularizer itself to be convex or analytically simple.
The same construction supplies a template for imposing other structural constraints on learned proximal operators.
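
As a rough illustration of the first point, the sketch below runs a standard proximal gradient loop in which the proximal step is an arbitrary callable. Everything here is an assumption made for illustration: the random linear operator `A`, the problem sizes, and the use of soft thresholding (the proximal operator of an l1 penalty) as a stand-in for a learned proximal network.

```python
import numpy as np

def objective(A, y, x, lam):
    """Least-squares data fit plus an l1 penalty (stand-in regularizer)."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(A, y, prox, step, n_iter=300):
    """Iterate x <- prox(x - step * A^T (A x - y)), starting from zero."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = prox(x - step * (A.T @ (A @ x - y)))
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 60)) / np.sqrt(30)   # hypothetical measurement operator
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.01 * rng.normal(size=30)

lam = 0.02
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
x_hat = proximal_gradient(A, y, lambda v: soft_threshold(v, step * lam), step)

print(objective(A, y, np.zeros(60), lam), objective(A, y, x_hat, lam))
```

A learned proximal operator would be passed as `prox` in place of the soft-thresholding lambda. How the implicit strength of a learned regularizer should interact with `step` is a design question this sketch leaves open.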

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be extended to other transformation groups, such as rotations or reflections, by modifying the equivariance constraints.
Because the networks remain exact proximal operators, they can be used inside larger optimization loops that require the proximal property for convergence guarantees.
Training data requirements might decrease if equivariance is enforced by architecture rather than learned from augmented examples.

Load-bearing premise

A neural-network parametrization can be found that satisfies the exact proximal-operator condition for some data-driven regularizer at the same time as the affine-equivariance constraints.

What would settle it

An input point where the network output violates either the proximal optimality condition x - f(x) ∈ ∂R(f(x)) for the implied regularizer or the equivariance relation under a shift or scale change.
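
One half of such a test, the equivariance check, can be sketched as a small numerical harness. The helper name `affine_equivariance_gap`, the two candidate maps, and the chosen transformations are hypothetical and only illustrate the idea: soft thresholding is a genuine proximal operator that is not affine-equivariant, while shrinking toward the mean is one that is.

```python
import numpy as np

def affine_equivariance_gap(f, x, a, b):
    """Largest entrywise deviation between f(a*x + b) and a*f(x) + b."""
    return np.max(np.abs(f(a * x + b) - (a * f(x) + b)))

def soft_threshold(v, tau=0.5):
    # Prox of tau * ||.||_1; expected to violate affine equivariance.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def shrink_to_mean(v, lam=0.7):
    # Prox of (lam/2) * ||v - mean(v)||^2; expected to satisfy it.
    mu = v.mean()
    return mu + (v - mu) / (1.0 + lam)

rng = np.random.default_rng(1)
x = rng.normal(size=16)
transforms = [(2.0, 0.0), (1.0, 5.0), (0.5, -3.0)]  # (scale, shift) pairs

gaps = {
    f.__name__: max(affine_equivariance_gap(f, x, a, b) for a, b in transforms)
    for f in (soft_threshold, shrink_to_mean)
}
print(gaps)
```

A clearly nonzero gap at any single input is enough to refute equivariance for that map. A gap at round-off level on sampled inputs is only evidence, not a proof, which is why the paper's guarantee is stated as a theorem rather than an experiment.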

Figures

Figures reproduced from arXiv: 2604.15556 by Jeremias Sulam, Oriel Savir, Zhenghan Fang.

**Figure 2.** PSNR of different methods (see description in text) for denoising across varying noise

**Figure 3.** Example results on BSDS500 across noise levels.

**Figure 4.** Results validating equivariance on BSDS500 under the transformation

Original abstract

Proximal operators are fundamental across many applications in signal processing and machine learning, including solving ill-posed inverse problems. Recent work has introduced Learned Proximal Networks (LPNs), providing parametric functions that compute exact proximals for data-driven and potentially non-convex regularizers. However, in many settings it is important to include additional structure to these regularizers--and their corresponding proximals--such as shift and scale equivariance. In this work, we show how to obtain learned functions parametrized by neural networks that provably compute exact proximal operators while being equivariant to shifts and scaling, which we dub Affine-Equivariant Learned Proximal Networks (AE-LPNs). We demonstrate our results on synthetic, constructive examples, and then on real data via denoising in out-of-distribution settings. Our equivariant learned proximals enhance robustness to noise distributions and affine shifts far beyond training distributions, improving the practical utility of learned proximal operators.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper constructs neural nets that are provably exact proximal operators while affine-equivariant, extending LPNs with a symmetry constraint that improves OOD robustness in denoising but likely restricts the regularizer to invariant cases.

The core contribution is a parametrization of neural networks that compute exact proximal operators for some regularizer while satisfying shift and scale equivariance. The authors call these AE-LPNs and show that the construction works on synthetic examples before moving to real denoising tasks, where the learned operators handle noise and affine shifts outside the training distribution better than baselines.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Affine-Equivariant Learned Proximal Networks (AE-LPNs), neural network-parametrized functions that provably compute exact proximal operators for data-driven (potentially non-convex) regularizers while satisfying equivariance to shifts and scalings. It provides constructive synthetic examples to illustrate the theory and applies the approach to real-data denoising, reporting improved robustness to noise distributions and affine shifts outside the training distribution.

Significance. If the central claims hold, the work extends prior Learned Proximal Network results by incorporating affine equivariance as an explicit structural constraint on both the regularizer and its proximal map. The provision of synthetic constructive examples is a clear strength, as it permits direct verification of the exactness and equivariance properties. The emphasis on out-of-distribution robustness in the denoising experiments also addresses a practical limitation of purely data-driven proximal methods.

major comments (2)

[Abstract and theoretical construction] The claim that AE-LPNs simultaneously achieve exact proximal-operator semantics for a data-driven regularizer R and affine equivariance must be shown not to force R to be invariant under the same group actions. Given the subdifferential relation x - f(x) ∈ ∂R(f(x)), requiring f(Tx) = T f(x) for affine T typically forces R(Ty) = R(y); if the neural parametrization enforces this via equivariant layers or parameter tying, it restricts admissible R even for non-convex cases, undermining the stated generality for arbitrary data-driven regularizers.
[Synthetic examples section] The constructive examples should explicitly state whether the generated regularizers are invariant under the affine group or whether non-invariant R can still be represented exactly by an AE-LPN; if only the invariant subclass works, this directly tests the weakest assumption that both exactness and equivariance can coexist without one property limiting the other.

minor comments (2)

[Introduction] Notation for the affine group actions (shifts and scalings) could be introduced more explicitly when first defining equivariance, to avoid ambiguity between the proximal map and the regularizer.
[Real-data experiments] The denoising experiments would benefit from a brief statement of the precise affine transformations applied at test time and how they differ from the training distribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful comments on our manuscript. We address the major comments point by point below, providing clarifications on the theoretical construction and agreeing to enhance the synthetic examples section for clarity. These revisions will strengthen the presentation of our results.

Referee: [Abstract and theoretical construction] The claim that AE-LPNs simultaneously achieve exact proximal-operator semantics for a data-driven regularizer R and affine equivariance must be shown not to force R to be invariant under the same group actions. Given the subdifferential relation x - f(x) ∈ ∂R(f(x)), requiring f(Tx) = T f(x) for affine T typically forces R(Ty) = R(y); if the neural parametrization enforces this via equivariant layers or parameter tying, it restricts admissible R even for non-convex cases, undermining the stated generality for arbitrary data-driven regularizers.

Authors: We thank the referee for highlighting this important theoretical nuance. Indeed, the affine equivariance of the proximal operator f does imply that the corresponding regularizer R must possess compatible invariance properties under the group actions (for instance, shift-invariance for translations, and appropriate scaling behavior for scalings). This is a direct consequence of the subdifferential characterization. Our construction does not claim to represent completely arbitrary regularizers without any structural constraints; rather, AE-LPNs enable the learning of data-driven regularizers that are equipped with affine equivariance in their proximal operators. The 'data-driven' aspect allows the specific form of R (including non-convexity) to be determined from data, while the equivariant neural parametrization ensures the proximal map respects the desired symmetries. This is analogous to how equivariant networks are used in other domains to incorporate inductive biases without losing expressivity within the constrained function class. We will revise the abstract and the theoretical construction section to explicitly discuss this relationship between the equivariance of f and the properties of R, making clear that the generality is within the class of regularizers admitting affine-equivariant proximals. revision: yes
Referee: [Synthetic examples section] The constructive examples should explicitly state whether the generated regularizers are invariant under the affine group or whether non-invariant R can still be represented exactly by an AE-LPN; if only the invariant subclass works, this directly tests the weakest assumption that both exactness and equivariance can coexist without one property limiting the other.

Authors: We agree with the referee that the synthetic examples would be improved by an explicit discussion of this point. In the constructive examples, the regularizers are generated to be invariant under the affine group actions, which ensures that their proximal operators can be exactly equivariant while satisfying the proximal operator properties. Non-invariant regularizers generally cannot have exactly affine-equivariant proximal operators, as this would violate the subdifferential relation for general transformations. This does not limit the practical utility, as the invariant class still allows for rich, data-driven, and non-convex regularizers learned via the AE-LPN parametrization. We will revise the synthetic examples section to explicitly state that the examples use invariant regularizers and to briefly explain why this is necessary for the coexistence of exactness and equivariance. revision: yes
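
The link between equivariance of f and structure in R that these responses discuss can be made concrete with a change of variables. The derivation below is a sketch under assumptions not taken from the paper: f is the single-valued proximal operator of R with unit weight, shifts act as x ↦ x + c·1 for a scalar c, and scalings act as x ↦ a·x with a > 0. It yields sufficient conditions only.

```latex
\begin{align*}
f(x) &= \operatorname*{arg\,min}_{y} \Big\{ \tfrac12 \|x - y\|^2 + R(y) \Big\}, \\
f(x + c\mathbf{1})
  &= \operatorname*{arg\,min}_{y} \Big\{ \tfrac12 \|x + c\mathbf{1} - y\|^2 + R(y) \Big\}
   = c\mathbf{1} + \operatorname*{arg\,min}_{z} \Big\{ \tfrac12 \|x - z\|^2 + R(z + c\mathbf{1}) \Big\}
   && (y = z + c\mathbf{1}), \\
f(a x)
  &= \operatorname*{arg\,min}_{y} \Big\{ \tfrac12 \|a x - y\|^2 + R(y) \Big\}
   = a \operatorname*{arg\,min}_{z} \Big\{ \tfrac{a^2}{2} \|x - z\|^2 + R(a z) \Big\}
   && (y = a z).
\end{align*}
```

Under these assumptions, R(z + c·1) = R(z) + const(c) is enough to give f(x + c·1) = f(x) + c·1, and R(a·z) = a²·R(z) + const(a) is enough to give f(a·x) = a·f(x). The scaling condition is therefore degree-two homogeneity up to an additive constant, which is one possible reading of the "appropriate scaling behavior" mentioned in the first response.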

Circularity Check

0 steps flagged

No circularity: derivation extends LPNs via independent architectural constraints

The paper's central claim is that neural networks can be parametrized to compute exact proximal operators for data-driven regularizers while enforcing affine equivariance. No equations, definitions, or self-citations in the abstract or visible text reduce this result to a tautology, fitted input, or self-referential premise. The extension from prior LPN work is presented as adding structural constraints (shift/scale equivariance) whose compatibility with exact prox computation is demonstrated on synthetic and real data, without the result being forced by construction or by load-bearing self-citation. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the existence of neural-network parametrizations that jointly satisfy the proximal operator fixed-point property and affine-equivariance; the paper introduces AE-LPNs as the new entity realizing this.

axioms (1)

standard math · Proximal operators are defined via the argmin of a regularized objective; when the regularizer is convex, they also satisfy standard properties such as firm nonexpansiveness.
This is the mathematical foundation for claiming that a learned function computes an exact proximal.

invented entities (1)

Affine-Equivariant Learned Proximal Network (AE-LPN) · no independent evidence
purpose: A neural-network parametrization that computes exact proximal operators while enforcing equivariance to shifts and scaling.
New architecture introduced to combine exact proximal computation with affine equivariance.

