Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. · 2020

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models

cs.CL · 2026-02-10 · unverdicted · novelty 7.0

Top-W applies Wasserstein-regularized truncation on token-embedding geometry to create a closed-form optimal crop for LLM sampling that outperforms prior methods by up to 33.7% on GSM8K, GPQA, AlpacaEval, and MT-Bench.

LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy

cs.LG · 2026-05-05 · unverdicted · novelty 5.0

ACSE estimates LLM uncertainty via adaptive semantic entropy clustering with conformal prediction guarantees, reporting higher AUROC than token entropy baselines on datasets like TriviaQA.

citing papers explorer

Showing 2 of 2 citing papers.

Geometry-Aware Decoding with Wasserstein-Regularized Truncation and Mass Penalties for Large Language Models
cs.CL · 2026-02-10 · unverdicted · none · ref 3
LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy
cs.LG · 2026-05-05 · unverdicted · none · ref 2
