Derives ODE limits of Adam-DA showing that first- and second-order momentum parameters reverse their convergence roles in zero-sum games compared to minimization, validated on GAN experiments.
Matecon, volume=
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background (1)
citation-polarity summary: background (1)
years: 2026 (5)
verdicts: UNVERDICTED (5)
citing papers explorer
- Understanding Dynamics of Adam in Zero-Sum Games: An ODE Approach
  Derives ODE limits of Adam-DA showing that first- and second-order momentum parameters reverse their convergence roles in zero-sum games compared to minimization, validated on GAN experiments.
- Efficient Gradient Methods for Distributed Saddle Problems
  A novel decoupled method for distributed saddle problems achieves optimal communication complexity via multi-stage residual norm minimization, with a matching lower bound and extension to variational inequalities.
- Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics
  SGD is reformulated via a master equation from discrete updates, producing a discrete Fokker-Planck equation that predicts non-stationary variance growth proportional to learning rate in flat Hessian directions.
- A Single-Loop Stochastic Gradient Algorithm for Minimax Optimization with Nonlinear Coupled Constraints
  SPACO is a new single-loop stochastic algorithm for stochastic nonconvex-concave minimax problems with nonlinear convex coupled constraints that uses penalty smoothing and provides non-asymptotic complexity bounds plus stationarity analysis.
- A unified perspective on fine-tuning and sampling with diffusion and flow models
  A unified framework for exponential tilting in diffusion and flow models that includes bias-variance decompositions showing finite gradient variance for some methods, norm bounds on adjoint ODEs, and adapted losses with new Crooks and Jarzynski identities.