Generating with confidence: Uncertainty quantification for black-box large language models. Transactions on Machine Learning Research, 2024.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers explorer
- When Can Voting Help, Hurt, or Change Course? Exact Structure of Binary Test-Time Aggregation
  The voting curve from repeated binary predictions is exactly equivalent to a signed voting signature capturing excess latent mass above the majority threshold at binomial variance scales, via signed Hausdorff moments.
- The First Token Knows: Single-Decode Confidence for Hallucination Detection
  First-token normalized entropy (phi_first) from one greedy decode reaches mean AUROC 0.820 for hallucination detection, matching or exceeding semantic self-consistency (0.793) and surface self-consistency (0.791) across three 7-8B models and two benchmarks.