Universal supervised lear ning for individual data

Y aniv Fogel, Meir Feder, “Universal supervised lear ning for individual data,” arXiv preprint arXiv:1812 · 2018 · cs.IT · arXiv 1812.09520

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open full Pith review browse 2 citing papers arXiv PDF

abstract

Universal supervised learning is considered from an information theoretic point of view following the universal prediction approach, see Merhav and Feder (1998). We consider the standard supervised "batch" learning where prediction is done on a test sample once the entire training data is observed, and the individual setting where the features and labels, both in the training and test, are specific individual quantities. The information theoretic approach naturally uses the self-information loss or log-loss. Our results provide universal learning schemes that compete with a "genie" (or reference) that knows the true test label. In particular, it is demonstrated that the main proposed scheme, termed Predictive Normalized Maximum Likelihood (pNML), is a robust learning solution that outperforms the current leading approach based on Empirical Risk Minimization (ERM). Furthermore, the pNML construction provides a pointwise indication for the learnability of the specific test challenge with the given training examples

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Training Language Models to Self-Correct via Reinforcement Learning

cs.LG · 2024-09-19 · unverdicted · novelty 6.0

SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.

Misspecified Universal Learning

cs.IT · 2026-05-11 · unverdicted · novelty 5.0

Minimax regret is characterized for misspecified universal learning with log-loss, yielding the optimal universal learner as a unified framework for any uncertainty in the data-generating process.

citing papers explorer

Showing 2 of 2 citing papers.

Training Language Models to Self-Correct via Reinforcement Learning cs.LG · 2024-09-19 · unverdicted · none · ref 197 · internal anchor
SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.
Misspecified Universal Learning cs.IT · 2026-05-11 · unverdicted · none · ref 14
Minimax regret is characterized for misspecified universal learning with log-loss, yielding the optimal universal learner as a unified framework for any uncertainty in the data-generating process.

Universal supervised lear ning for individual data

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer