Flat minima are illusory; generalization is driven by weakness, a reparameterization-invariant measure of compatible completions that predicts performance better than sharpness on MNIST and Fashion-MNIST.
Trends in Cognitive Sciences , year =
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.LG 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
Kuramoto synchronization dynamics implement a provably unique and globally attractive attention mechanism that replaces softmax for physical substrates and shows competitive empirical performance.
LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.
citing papers explorer
-
Are Flat Minima an Illusion?
Flat minima are illusory; generalization is driven by weakness, a reparameterization-invariant measure of compatible completions that predicts performance better than sharpness on MNIST and Fashion-MNIST.
-
Attention by Synchronization in Coupled Oscillator Networks
Kuramoto synchronization dynamics implement a provably unique and globally attractive attention mechanism that replaces softmax for physical substrates and shows competitive empirical performance.
-
Hypothesis generation and updating in large language models
LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.