
arXiv: 2505.01595 · v2 · submitted 2025-05-02 · cs.CL · cs.AI · cs.LG


Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

keywords: probability, estimation, models, uncertainty, conditional, estimates, fine-grained, information

We present a state-of-the-art model for fine-grained probability estimation of propositions conditioned on context. Recent advances in large language models (LLMs) have significantly enhanced their reasoning capabilities, particularly on well-defined tasks with complete information. However, LLMs continue to struggle with making accurate and well-calibrated probabilistic predictions under uncertainty or partial information. While incorporating uncertainty into model predictions often boosts performance, obtaining reliable estimates of that uncertainty remains understudied. In particular, LLM probability estimates tend to be coarse and biased towards more frequent numbers. Through a combination of human and synthetic data creation and assessment, scaling to larger models, and better supervision, we propose a set of strong and precise probability estimation models. We conduct systematic evaluations across tasks that rely on conditional probability estimation and show that our approach consistently outperforms existing fine-tuned and prompting-based methods by a large margin.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. MoCo: A One-Stop Shop for Model Collaboration Research

    cs.CL 2026-01 accept novelty 6.0

    MoCo supplies a unified library of 26 collaboration strategies and benchmarks demonstrating average outperformance over single models in 61 percent of (model, data) pairs.