SGDR: Stochastic Gradient Descent with Warm Restarts
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Citing papers
- Bounded-Rationality, Hedging, and Generalization
  Generalization is a testable hedging property of the learner's response law, recovered via f-divergence regularizers that induce information-geometric curves between training loss and sample dependence.
- Neural posterior estimation of the neutrino direction in IceCube using transformer-encoded normalizing flows on the sphere
  A transformer-encoded spherical normalizing flow achieves state-of-the-art angular resolution for IceCube neutrino tracks and showers, improving median resolution by factors of 1.3-2.5 over B-spline likelihoods at 100 TeV and outperforming prior ML methods for muons.
- Dynamic Cluster Data Sampling for Efficient and Long-Tail-Aware Vision-Language Pre-training
  DynamiCS dynamically scales semantic clusters per training epoch to reduce VLM pre-training compute while improving accuracy on long-tail concepts compared to static or flattening baselines.