Requiring LICQ/SCS/SOSC everywhere in bilevel optimization is non-prevalent and rigid, while holding almost everywhere is prevalent, but the distinction introduces fundamental difficulties.
8 Pith papers cite this work.
years: 2026 (8)
representative citing papers
citing papers explorer
- On the Nature of Regularity Assumptions in Bilevel Optimization with Constrained Lower-level Problem
  Requiring LICQ/SCS/SOSC everywhere in bilevel optimization is non-prevalent and rigid, while holding almost everywhere is prevalent, but the distinction introduces fundamental difficulties.
- Concentration of General Stochastic Approximation Under Heavy-Tailed Markovian Noise
  Establishes maximal concentration bounds for stochastic approximation under heavy-tailed Markovian noise, with tails ranging from sub-Gaussian to heavier than Weibull depending on step sizes and contractivity properties, plus a truncation argument for unbounded noise.
- Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
  RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
- Convergence of difference inclusions via a diameter criterion
  A diameter criterion tied to a potential function certifies convergence of difference inclusions, enabling discrete proofs for first-order optimization methods with diminishing steps.
- Fast Rates for Offline Contextual Bandits with Forward-KL Regularization under Single-Policy Concentrability
  The paper establishes the first $\tilde{O}(\epsilon^{-1})$ upper bounds and matching lower bounds for forward-KL-regularized offline contextual bandits under single-policy concentrability in both tabular and general function approximation settings.
- Tessellations of Semi-Discrete Flow Matching
  Semi-discrete Flow Matching produces terminal assignment regions that are topologically simple (open, simply connected, homeomorphic to the ball under assumption) yet geometrically distinct from optimal transport Laguerre cells, as they can be non-convex with curved boundaries.
- Actor-Critic Algorithm for Dynamic Expectile and CVaR
  A model-free off-policy actor-critic algorithm is constructed for dynamic expectile and CVaR using a surrogate policy gradient without transition perturbation and elicitability-based value learning, with empirical outperformance in risk-averse domains.
- A Single-Loop Stochastic Gradient Algorithm for Minimax Optimization with Nonlinear Coupled Constraints
  SPACO is a new single-loop stochastic algorithm for stochastic nonconvex-concave minimax problems with nonlinear convex coupled constraints that uses penalty smoothing and provides non-asymptotic complexity bounds plus stationarity analysis.