Neural Networks
3 Pith papers cite this work.
citing papers explorer
- Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization
  Adam's adaptive preconditioning and first-moment averaging improve high-probability tracking error in noise-dominated nonstationary regimes but can increase it under strong drift, where SGD achieves a smaller floor, with explicit beta-dependent bounds.
- SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
  SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
- Adaptive Data Harvesting for Efficient Neural Network Learning with Universal Constraints
  A reinforcement learning policy learns to adaptively harvest data samples, improving empirical constraint satisfaction and training efficiency for Lyapunov NNs and PINNs.