RAMPAGE: RAndomized Mid-Point for debiAsed Gradient Extrapolation
Pith reviewed 2026-05-15 00:22 UTC · model grok-4.3
The pith
Randomized mid-point sampling produces unbiased extragradient updates for variational inequalities with O(1/k) convergence
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Extragradient (EG) may suffer from discretization bias when applied to non-linear vector fields. RAMPAGE addresses this with randomized mid-point sampling, which yields unbiased updates. RAMPAGE+ adds antithetic sampling and acts as an unbiased geometric path-integrator that completely removes internal first-order terms from the variance. Both methods come with provable O(1/k) convergence for root finding under co-coercive, co-hypomonotone, and generalized Lipschitzness regimes, and with convergence guarantees for stochastic and deterministic smooth convex-concave games.
What carries the argument
Randomized mid-point sampling in the extrapolation step, which evaluates the vector field at a stochastic convex combination between the current iterate and its extrapolated point to debias the update.
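A minimal sketch of one such step may make the mechanism concrete. It is an illustration under stated assumptions, not the paper's algorithm: the descent sign convention `z - gamma * F(z)`, the uniform distribution of `t` on [0, 1], the test field `F`, and the names `rampage_step` and `run` are all hypothetical choices made here.

```python
import math
import random

def F(z):
    # Hypothetical monotone, nonlinear test field (not taken from the paper).
    x, y = z
    return (math.tanh(x) + y, math.tanh(y) - x)

def rampage_step(z, gamma, rng):
    # Extrapolated point, using the usual extragradient sign convention (assumed).
    g = F(z)
    z_ext = tuple(zi - gamma * gi for zi, gi in zip(z, g))
    # Random convex combination of the iterate and its extrapolated point.
    t = rng.random()  # assumed uniform on [0, 1)
    z_mid = tuple(zi + t * (ei - zi) for zi, ei in zip(z, z_ext))
    # Update from the current iterate using the field at the sampled point.
    g_mid = F(z_mid)
    return tuple(zi - gamma * gi for zi, gi in zip(z, g_mid))

def run(z0, gamma=0.3, steps=200, seed=0):
    rng = random.Random(seed)
    z = z0
    for _ in range(steps):
        z = rampage_step(z, gamma, rng)
    return z

z_final = run((1.0, -0.5))
residual = math.hypot(*F(z_final))
print(f"residual norm after 200 steps: {residual:.2e}")
```

Fixing `t = 1` in this sketch recovers a plain extragradient step, and `t = 0` a plain forward step, so the random draw interpolates between the two.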
If this is right
- Unbiased updates prevent accumulation of discretization error over many iterations.
- RAMPAGE+ removes first-order variance contributions through negative correlation.
- O(1/k) rates hold for root finding in co-coercive, co-hypomonotone, and generalized Lipschitz regimes.
- Symmetrically scaled variants extend the approach to constrained variational inequalities.
- The methods apply to both stochastic and deterministic smooth convex-concave games with deterministic bounds in several cases.
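The variance claim for antithetic sampling can be probed with a small experiment. The sketch below compares the average of two independent mid-point evaluations with the average of a mirrored pair `(t, 1 - t)` along a fixed segment. The 1D field `tanh`, the segment endpoints, and the uniform distribution of `t` are assumptions chosen for illustration; nothing here reproduces the paper's experiments.

```python
import math
import random
import statistics

X, D = 0.5, -1.5  # hypothetical segment: from X to X + D

def phi(t):
    # Field F(x) = tanh(x) evaluated at the point X + t * D on the segment.
    return math.tanh(X + t * D)

def sample_estimators(n=20000, seed=0):
    rng = random.Random(seed)
    independent, antithetic = [], []
    for _ in range(n):
        t, s = rng.random(), rng.random()
        independent.append(0.5 * (phi(t) + phi(s)))        # two independent draws
        antithetic.append(0.5 * (phi(t) + phi(1.0 - t)))   # mirrored pair
    return independent, antithetic

ind, anti = sample_estimators()

# Closed-form average of tanh over the segment, used as the reference value.
exact = (math.log(math.cosh(X + D)) - math.log(math.cosh(X))) / D

print(f"reference segment average : {exact:.4f}")
print(f"independent pair mean/var : {statistics.mean(ind):.4f} / {statistics.variance(ind):.5f}")
print(f"antithetic pair mean/var  : {statistics.mean(anti):.4f} / {statistics.variance(anti):.5f}")
```

Both estimators cost two field evaluations per sample and target the same mean; the mirrored pair differs only in how the second evaluation point is chosen.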
Where Pith is reading between the lines
- The geometric path-integrator view may suggest similar debiasing for other first-order discretization schemes in numerical ODE methods.
- Antithetic sampling could combine with momentum or other accelerators to further lower variance in high-dimensional equilibrium problems.
- Empirical runs on quadratic and mildly nonlinear monotone operators would directly measure residual bias after many steps.
Load-bearing premise
That randomized mid-point sampling with antithetic variance reduction produces unbiased updates and removes first-order variance terms without introducing new biases under the stated co-coercivity and Lipschitz regimes.
What would settle it
A numerical test or closed-form calculation on a simple nonlinear co-coercive operator showing that the expected RAMPAGE update does not match the continuous vector field or converges to the wrong fixed point.
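One possible shape for such a check is sketched below on the 1D operator `F(x) = tanh(x)`, which is the gradient of `log(cosh(x))` and hence co-coercive. The sketch treats the average of `F` over the segment from the iterate to its extrapolated point as the target, following the rebuttal's description; whether this is exactly the paper's notion of bias, along with the uniform sampling of `t`, the sign convention, and the chosen `x` and `gamma`, is an assumption.

```python
import math
import random

def F(x):
    # Hypothetical co-coercive test operator: derivative of log(cosh(x)).
    return math.tanh(x)

def segment_average(x, x_ext):
    # Closed form of the average of tanh over the segment [x, x_ext].
    return (math.log(math.cosh(x_ext)) - math.log(math.cosh(x))) / (x_ext - x)

def expected_midpoint_direction(x, x_ext, n=50000, seed=0):
    # Monte Carlo estimate of E_t[F(x + t * (x_ext - x))] with t uniform on [0, 1).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        t = rng.random()
        total += F(x + t * (x_ext - x))
    return total / n

x, gamma = 2.0, 1.0
x_ext = x - gamma * F(x)              # extrapolated point (sign convention assumed)
exact = segment_average(x, x_ext)     # reference value
mc = expected_midpoint_direction(x, x_ext)
eg = F(x_ext)                         # direction used by plain extragradient

print(f"segment average of F      : {exact:.4f}")
print(f"mean random mid-point eval: {mc:.4f}")
print(f"extragradient endpoint    : {eg:.4f}")
```

A gap between the first two printed values that does not shrink as `n` grows would count against the unbiasedness claim in this toy setting; agreement here says nothing about convergence to the correct fixed point, which would need a separate multi-step experiment.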
Original abstract
A celebrated method for Variational Inequalities (VIs) is Extragradient (EG), which can be viewed as a standard discrete-time integration scheme. With this view in mind, in this paper we show that EG may suffer from discretization bias when applied to non-linear vector fields, conservative or otherwise. To resolve this discretization shortcoming, we introduce RAndomized Mid-Point for debiAsed Gradient Extrapolation (RAMPAGE) and its variance-reduced counterpart, RAMPAGE+, which leverages antithetic sampling. In contrast with EG, both methods are unbiased. Furthermore, leveraging negative correlation, RAMPAGE+ acts as an unbiased, geometric path-integrator that completely removes internal first-order terms from the variance, provably improving upon RAMPAGE. We further demonstrate that both methods enjoy provable $\mathcal{O}(1/k)$ convergence guarantees for a range of problems including root finding under co-coercive, co-hypomonotone, and generalized Lipschitzness regimes. Furthermore, we introduce symmetrically scaled variants to extend our results to constrained VIs. Finally, we provide convergence guarantees of both methods for stochastic and deterministic smooth convex-concave games. Somewhat interestingly, despite being a randomized method, RAMPAGE+ attains purely deterministic bounds for a number of the studied settings.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that Extragradient (EG) incurs discretization bias on non-linear vector fields when viewed as a discrete integration scheme for variational inequalities. It introduces RAMPAGE, which employs randomized mid-point sampling to produce unbiased gradient extrapolations, and RAMPAGE+, which augments this with antithetic sampling to cancel first-order variance contributions while remaining unbiased. Both methods are asserted to achieve O(1/k) convergence for root-finding under co-coercive, co-hypomonotone, and generalized Lipschitz regimes, with extensions to symmetrically scaled variants for constrained VIs and to stochastic/deterministic smooth convex-concave games; RAMPAGE+ is further claimed to deliver deterministic bounds despite randomization.
Significance. If the unbiasedness via linearity of expectation and the variance cancellation hold, the work supplies a principled Monte Carlo debiasing technique that strengthens convergence analysis for non-linear VIs beyond standard EG. The deterministic guarantees for a randomized method and the coverage of multiple regimes (including games) represent concrete strengths that could influence practical algorithm design in optimization.
major comments (2)
- [§3] §3, unbiasedness argument: the claim that randomized mid-point sampling yields an unbiased estimator of the path integral relies on linearity of expectation independent of F; however, for non-conservative fields the integration path must be explicitly specified to ensure the expectation equals the continuous operator without residual discretization error.
- [Theorem 4.1] Theorem 4.1 (O(1/k) rate under co-hypomonotonicity): the proof sketch invokes the variance reduction of RAMPAGE+ to tighten the bound, but the step from negative correlation to complete removal of first-order terms requires an explicit variance expansion (analogous to Eq. (12) for RAMPAGE) to confirm no higher-order residuals remain under the stated Lipschitz regime.
minor comments (3)
- [§5] The symmetrically scaled variants for constrained VIs are introduced in §5 but lack a clear statement of how the scaling parameter is selected (e.g., via projection or normalization) and whether it preserves the unbiasedness property.
- Notation for the generalized Lipschitzness regime should be aligned with standard references (e.g., explicit constant L vs. local Lipschitz) to avoid ambiguity when comparing to co-coercive assumptions.
- [Figure 2] Figure 2 (convergence plots) would benefit from error bars or multiple random seeds to illustrate the variance reduction claimed for RAMPAGE+ over RAMPAGE.
Simulated Author's Rebuttal
We thank the referee for their positive assessment and constructive comments, which help strengthen the presentation of our unbiased debiasing approach. We address each major comment below and will incorporate the suggested clarifications and expansions in the revised manuscript.
- Referee: [§3] §3, unbiasedness argument: the claim that randomized mid-point sampling yields an unbiased estimator of the path integral relies on linearity of expectation independent of F; however, for non-conservative fields the integration path must be explicitly specified to ensure the expectation equals the continuous operator without residual discretization error.
  Authors: We agree that the integration path should be stated explicitly for rigor, particularly when the vector field is non-conservative. In the revised manuscript we will define the path as the straight-line segment from the current point x_k to the extrapolated point x_k + γ F(x_k). Using only linearity of expectation, we will prove that the expectation of the randomized midpoint estimator equals the line integral of F along this segment, with no residual discretization bias in the mean. The argument does not rely on path independence or conservativeness and therefore holds for general Lipschitz fields.
  revision: yes
- Referee: [Theorem 4.1] Theorem 4.1 (O(1/k) rate under co-hypomonotonicity): the proof sketch invokes the variance reduction of RAMPAGE+ to tighten the bound, but the step from negative correlation to complete removal of first-order terms requires an explicit variance expansion (analogous to Eq. (12) for RAMPAGE) to confirm no higher-order residuals remain under the stated Lipschitz regime.
  Authors: We appreciate the suggestion to make the variance analysis fully explicit. In the revision we will insert a detailed variance expansion for RAMPAGE+ (modeled on Eq. (12) for RAMPAGE) that isolates the cross term arising from antithetic sampling. Under the generalized Lipschitz assumption this cross term exactly cancels the leading first-order variance contribution, leaving only O(γ²) residuals that are absorbed into the existing O(1/k) bound. The expanded derivation confirms that no uncancelled first-order terms survive, thereby justifying the tightened rate stated in Theorem 4.1.
  revision: yes
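The two promised arguments can be summarized in a short worked form. This is an illustrative sketch and not the paper's Eq. (12) or its proof: it assumes t is uniform on [0, 1], writes the extrapolated point as \tilde{x}_k without fixing a sign convention, reads the rebuttal's "line integral" as the average of F over the segment, and assumes F is twice continuously differentiable along the segment, which is stronger than the regimes named in the report.

```latex
% Notation (assumed here): \tilde{x}_k is the extrapolated point, d_k = \tilde{x}_k - x_k,
% t \sim \mathrm{Unif}[0,1], and \varphi(t) = F(x_k + t\, d_k).
\begin{align*}
\mathbb{E}_t\big[\varphi(t)\big] &= \int_0^1 F(x_k + t\, d_k)\, dt
  && \text{(mean of a single random mid-point evaluation)} \\
\varphi(t) &= \varphi(\tfrac12) + (t - \tfrac12)\,\varphi'(\tfrac12) + R(t),
  \quad \|R(t)\| \le \tfrac12 (t - \tfrac12)^2 \sup_{s \in [0,1]} \|\varphi''(s)\| \\
\tfrac12\big[\varphi(t) + \varphi(1 - t)\big] &= \varphi(\tfrac12) + \tfrac12\big[R(t) + R(1 - t)\big]
  && \text{(linear terms cancel: } (t - \tfrac12) + (\tfrac12 - t) = 0\text{)}
\end{align*}
```

The first line is just the definition of the expectation under a uniform t, so it needs no conservativeness. Because 1 - t is also uniform on [0, 1], the antithetic average in the third line has the same mean, while its only random part is the second-order remainder; a single evaluation instead carries the random linear term (t - 1/2) φ'(1/2).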
Circularity Check
No significant circularity in derivation chain
The paper establishes unbiasedness of the RAMPAGE estimator directly from linearity of expectation applied to randomized midpoint sampling of the vector field, a standard Monte Carlo identity that holds independently of the field's linearity or the target result. Variance reduction in RAMPAGE+ follows from explicit cancellation of first-order terms under antithetic sampling, again by direct algebraic expansion without parameter fitting or self-referential definitions. O(1/k) convergence rates are then obtained as standard extensions once unbiasedness is granted, under the stated co-coercivity, co-hypomonotonicity, and Lipschitz assumptions; these steps do not reduce to fitted inputs, self-citations, or ansatzes imported from prior work by the same authors. The argument structure is self-contained against external benchmarks such as classical extragradient analysis.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Vector field satisfies co-coercivity, co-hypomonotonicity, or generalized Lipschitzness
- domain assumption: Smooth convex-concave structure for game settings
Forward citations
Cited by 1 Pith paper
- Unified High-Probability Analysis of Stochastic Variance-Reduced Estimation
  A unified recursion framework for stochastic variance-reduced estimation yields high-probability bounds and the first Õ(ε^{-3}) oracle complexity for stochastic optimization with expectation constraints.