Derives ODE limits of Adam-DA showing that first- and second-order momentum parameters reverse their convergence roles in zero-sum games compared to minimization, validated on GAN experiments.
Neural computation , volume=
5 Pith papers cite this work. Polarity classification is still indexing.
years
2026 5verdicts
UNVERDICTED 5representative citing papers
DPS quantifies deviation of per-sample decision patterns from class averages and shows linear correlation with generalization gaps while unifying degradation scenarios into a continuous trajectory.
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.
DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.
EPGS detects high-confidence factual errors in LLMs by using embedding perturbations to measure gradient sensitivity as a proxy for sharp versus flat minima.
citing papers explorer
-
Understanding Dynamics of Adam in Zero-Sum Games: An ODE Approach
Derives ODE limits of Adam-DA showing that first- and second-order momentum parameters reverse their convergence roles in zero-sum games compared to minimization, validated on GAN experiments.
-
Understanding Generalization through Decision Pattern Shift
DPS quantifies deviation of per-sample decision patterns from class averages and shows linear correlation with generalization gaps while unifying degradation scenarios into a continuous trajectory.
-
Optimizer-Induced Mode Connectivity: From AdamW to Muon
Optimizer choice induces distinct connected regions in the loss landscape of two-layer ReLU networks, with AdamW and Muon sometimes separated by provable barriers.
-
Don't Stop Me Yet: Sampling Loss Minima via Dissipative Riemannian Mechanics
DiMS is a physics-inspired dynamical sampler guaranteed to exactly sample reparameterization-invariant minimum level sets in neural network loss landscapes.
-
From Flat Facts to Sharp Hallucinations: Detecting Stubborn Errors via Gradient Sensitivity
EPGS detects high-confidence factual errors in LLMs by using embedding perturbations to measure gradient sensitivity as a proxy for sharp versus flat minima.