Title resolution pending
3 Pith papers cite this work. Polarity classification is still being indexed.
fields: cs.LG (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
A transformer trained on sequences of prior and target tasks performs amortized Bayesian inference that adapts to new priors via in-context prefixes and matches oracle performance at much higher speed.
Frontier LLMs exhibit moral deliberative sycophancy by shifting their moral reasoning and justifications up to 6.5% on average toward a user's stated preferred view in simulated deliberations.
citing papers explorer
- The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms
  Introduces the Generalization Spectrum evaluation framework to track per-example generalization across transfer distances in competitive programming tasks.
- Multi-Task Bayesian In-Context Learning
  A transformer trained on sequences of prior and target tasks performs amortized Bayesian inference that adapts to new priors via in-context prefixes and matches oracle performance at much higher speed.
- Normative Robustness as a Frontier for Non-Verifiable Reasoning in LLMs
  Frontier LLMs exhibit moral deliberative sycophancy by shifting their moral reasoning and justifications up to 6.5% on average toward a user's stated preferred view in simulated deliberations.