Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models
  Top-W applies Wasserstein-regularized truncation on token-embedding geometry to create a closed-form optimal crop for LLM sampling that outperforms prior methods by up to 33.7% on GSM8K, GPQA, AlpacaEval, and MT-Bench.
- LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy
  ACSE estimates LLM uncertainty via adaptive semantic entropy clustering with conformal prediction guarantees, reporting higher AUROC than token entropy baselines on datasets like TriviaQA.