
Universal supervised learning for individual data

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Universal supervised learning is considered from an information theoretic point of view, following the universal prediction approach of Merhav and Feder (1998). We consider the standard supervised "batch" learning where prediction is done on a test sample once the entire training data is observed, and the individual setting where the features and labels, in both training and test, are specific individual quantities. The information theoretic approach naturally uses the self-information loss or log-loss. Our results provide universal learning schemes that compete with a "genie" (or reference) that knows the true test label. In particular, it is demonstrated that the main proposed scheme, termed Predictive Normalized Maximum Likelihood (pNML), is a robust learning solution that outperforms the current leading approach based on Empirical Risk Minimization (ERM). Furthermore, the pNML construction provides a pointwise indication for the learnability of the specific test challenge with the given training examples.
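The abstract does not spell out the pNML construction, so the following is only a minimal sketch of how it is commonly described: for each candidate test label, a "genie" refits the model by maximum likelihood with that label appended to the training set, and the resulting probabilities are normalized across labels. The log of the normalizer is the pointwise regret, which can serve as the learnability indication. The tabular Bernoulli model class (one independent parameter per feature value) and the function name `pnml_binary` are assumptions chosen for illustration, not taken from the paper.

```python
import math


def pnml_binary(train, x_test):
    """pNML sketch for a toy tabular Bernoulli class (illustrative assumption).

    train: list of (x, y) pairs, y in {0, 1}, x any hashable feature value.
    Returns (probs, regret): probs maps label -> pNML probability,
    regret is the log-normalizer in nats.
    """
    # Only training points sharing the test feature affect this model class.
    n = sum(1 for x, _ in train if x == x_test)
    k = sum(y for x, y in train if x == x_test)

    genie = {}
    for y in (0, 1):
        # Genie step: ML estimate of P(y=1 | x_test) with (x_test, y) appended.
        theta = (k + y) / (n + 1)
        # Probability the refitted model assigns to the assumed label y.
        genie[y] = theta if y == 1 else 1.0 - theta

    norm = genie[0] + genie[1]
    probs = {y: p / norm for y, p in genie.items()}
    regret = math.log(norm)  # large regret: test point is hard to learn
    return probs, regret


train = [("a", 1), ("a", 1), ("a", 0), ("b", 0)]
probs_seen, regret_seen = pnml_binary(train, "a")      # feature seen 3 times
probs_unseen, regret_unseen = pnml_binary(train, "c")  # feature never seen
```

In this toy class the regret for a feature value seen n times is log((n + 2) / (n + 1)), so it shrinks as more matching training examples arrive and reaches its maximum, log 2, for an unseen feature value, where the prediction is uniform.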

citation-role summary

background 1

citation-polarity summary

fields

cs.IT 1 cs.LG 1

years

2026 1 2024 1

verdicts

UNVERDICTED 2

polarities

background 1

representative citing papers

Misspecified Universal Learning

cs.IT · 2026-05-11 · unverdicted · novelty 5.0

Minimax regret is characterized for misspecified universal learning with log-loss, yielding the optimal universal learner as a unified framework for any uncertainty in the data-generating process.

citing papers explorer

Showing 2 of 2 citing papers.

  • Training Language Models to Self-Correct via Reinforcement Learning cs.LG · 2024-09-19 · unverdicted · none · ref 197 · internal anchor

    SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

  • Misspecified Universal Learning cs.IT · 2026-05-11 · unverdicted · none · ref 14

    Minimax regret is characterized for misspecified universal learning with log-loss, yielding the optimal universal learner as a unified framework for any uncertainty in the data-generating process.