SGD is reformulated via a master equation from discrete updates, producing a discrete Fokker-Planck equation that predicts non-stationary variance growth proportional to learning rate in flat Hessian directions.
International Conference on Machine Learning
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
Characterizes spurious correlation mechanisms in preference optimization via mean spurious bias and causal-spurious correlation leakage, demonstrates irreducible vulnerability to distribution shift, and introduces tie training as selective mitigation with validation on log-linear models and empirica
Evolutionary game theory shows gradient descent and stochastic gradient descent drive neural networks to distinct stable states favoring shortcut or core subnetworks, with data and optimization noise shaping shortcut bias formation.
Develops a margin-adaptive learned confidence estimator for LLMs with generalization guarantees to improve agreement rates with human judgments over heuristic baselines.
citing papers explorer
- Why SGD is not Brownian Motion: A New Perspective on Stochastic Dynamics
  SGD is reformulated via a master equation from discrete updates, producing a discrete Fokker-Planck equation that predicts non-stationary variance growth proportional to learning rate in flat Hessian directions.
- Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training
  Characterizes spurious correlation mechanisms in preference optimization via mean spurious bias and causal-spurious correlation leakage, demonstrates irreducible vulnerability to distribution shift, and introduces tie training as selective mitigation with validation on log-linear models and empirica
- Deciphering Shortcut Learning from an Evolutionary Game Theory Perspective
  Evolutionary game theory shows gradient descent and stochastic gradient descent drive neural networks to distinct stable states favoring shortcut or core subnetworks, with data and optimization noise shaping shortcut bias formation.
- Margin-Adaptive Confidence Ranking for Reliable LLM Judgement
  Develops a margin-adaptive learned confidence estimator for LLMs with generalization guarantees to improve agreement rates with human judgments over heuristic baselines.