arxiv: 2602.15136 · v2 · pith:XOCHYLDHnew · submitted 2026-02-16 · 📊 stat.ML · cs.LG

Universal priors: solving empirical Bayes via Bayesian inference and pretraining

classification 📊 stat.ML cs.LG
keywords test · training · bayes · bayesian · distributions · empirical · length · posterior

We theoretically justify the recent empirical finding of [Teh et al., 2025] that a transformer pretrained on synthetically generated data achieves strong performance on empirical Bayes (EB) problems. We take an indirect approach to this question: rather than analyzing the model architecture or training dynamics, we ask why a pretrained Bayes estimator, trained under a prespecified training distribution, can adapt to arbitrary test distributions. Focusing on Poisson EB problems, we identify the existence of universal priors such that training under these priors yields a near-optimal regret bound of $\widetilde{O}(\frac{1}{n})$ uniformly over all test distributions. Our analysis leverages the classical phenomenon of posterior contraction in Bayesian statistics, showing that the pretrained transformer adapts to unknown test distributions precisely through posterior contraction. This perspective also explains the phenomenon of length generalization, in which the test sequence length exceeds the training length, as the model performs Bayesian inference using a generalized posterior.
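As a small illustration of the Poisson EB setting, the sketch below computes the Bayes estimator (posterior mean) of a Poisson rate under a discrete prior, checks it against the classical identity $E[\theta \mid X = x] = (x+1) f_G(x+1) / f_G(x)$, where $f_G$ is the marginal PMF, and measures the regret incurred when the estimator is built from a "training" prior that differs from the "test" prior. This is not the paper's construction: the discrete priors, the function names, the truncation `x_max`, and the per-observation definition of regret (excess mean squared error over the test-prior oracle) are all assumptions made for illustration.

```python
import math


def poisson_pmf(x, theta):
    """Poisson PMF P(X = x | theta), computed in log space; theta must be > 0."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))


def marginal_pmf(x, atoms, weights):
    """Marginal PMF f_G(x) when theta is drawn from the discrete prior G."""
    return sum(w * poisson_pmf(x, t) for t, w in zip(atoms, weights))


def bayes_posterior_mean(x, atoms, weights):
    """Posterior mean E[theta | X = x] under the discrete prior G."""
    num = sum(w * t * poisson_pmf(x, t) for t, w in zip(atoms, weights))
    return num / marginal_pmf(x, atoms, weights)


def robbins_form(x, atoms, weights):
    """Same posterior mean, written as (x + 1) * f_G(x + 1) / f_G(x)."""
    return (x + 1) * marginal_pmf(x + 1, atoms, weights) / marginal_pmf(x, atoms, weights)


def mse_under_prior(estimator, atoms, weights, x_max=60):
    """Mean squared error of `estimator` when theta ~ G and X ~ Poisson(theta).

    The sum over x is truncated at x_max (an illustrative choice that is
    adequate only for small rates such as the ones used below).
    """
    total = 0.0
    for t, w in zip(atoms, weights):
        for x in range(x_max + 1):
            total += w * poisson_pmf(x, t) * (estimator(x) - t) ** 2
    return total


def regret(train_prior, test_prior):
    """Excess MSE of the train-prior Bayes estimator over the test-prior oracle."""
    pretrained = lambda x: bayes_posterior_mean(x, *train_prior)
    oracle = lambda x: bayes_posterior_mean(x, *test_prior)
    return mse_under_prior(pretrained, *test_prior) - mse_under_prior(oracle, *test_prior)


# Hypothetical priors, given as (atoms, weights).
test_prior = ([0.5, 2.0, 5.0], [0.3, 0.5, 0.2])
train_prior = ([1.0, 3.0], [0.5, 0.5])

if __name__ == "__main__":
    for x in range(5):
        print(x, bayes_posterior_mean(x, *test_prior), robbins_form(x, *test_prior))
    print("regret of mismatched prior:", regret(train_prior, test_prior))
```

The regret here is nonnegative by construction, since the oracle minimizes the squared error for every observed count. The paper's result concerns how this kind of gap shrinks with the sequence length $n$ when training uses a universal prior, which this single-observation sketch does not attempt to reproduce.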

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Poisson Empirical Bayes via Gamma-Smoothed Nonparametric Maximum Likelihood

    math.ST 2026-05 unverdicted novelty 7.0

    A Gamma-smoothed NPMLE for Poisson empirical Bayes achieves optimal, nearly parametric rates for posterior means and enables asymptotically exact, shorter confidence sets with marginal coverage under compact support.

  2. Quasi-Bayes empirical Bayes estimation of sums of random variables

    stat.ME 2026-06 unverdicted novelty 6.0

    A nonparametric quasi-Bayes empirical Bayes procedure is proposed for estimating sums of random variables, with recursive mixing distribution estimation, asymptotic guarantees, and uncertainty quantification.

  3. Merging of Bayes and quasi-Bayes empirical Bayes procedures for Poisson compound decisions

    stat.ME 2026-07 unverdicted novelty 5.0

    Proves frequentist merging of Bayesian (Dirichlet process) and quasi-Bayesian (Newton's algorithm) empirical Bayes estimators for Poisson compound decisions via concentration rates on marginal PMFs and excess risks, w...