Predictive uncertainty quantification via risk decompositions for strictly proper scoring rules

11 Nikita Kotelevskii, Maxim Panov · 2025 · arXiv 2402.10727

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning

cs.CL · 2026-04-10 · unverdicted · novelty 5.0

Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.

Rethinking Uncertainty Estimation in LLMs: A Principled Single-Sequence Measure

cs.LG · 2024-12-19 · unverdicted · novelty 5.0

Negative log-likelihood of the greedy-decoded most likely sequence (G-NLL) is a principled single-sequence uncertainty measure for LLMs that achieves state-of-the-art results.

citing papers explorer

Showing 2 of 2 citing papers.

Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning cs.CL · 2026-04-10 · unverdicted · none · ref 12
Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.
Rethinking Uncertainty Estimation in LLMs: A Principled Single-Sequence Measure cs.LG · 2024-12-19 · unverdicted · none · ref 11
Negative log-likelihood of the greedy-decoded most likely sequence (G-NLL) is a principled single-sequence uncertainty measure for LLMs that achieves state-of-the-art results.

Predictive uncertainty quantification via risk decompositions for strictly proper scoring rules

fields

years

verdicts

representative citing papers

citing papers explorer