LLMs Will Always Hallucinate, and We Need to Live With This

2024 · arXiv 2409.05746

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

DisAAD trains a 1%-sized proxy model via adversarial distillation to quantify uncertainty in black-box LLMs by aligning with their output distributions.

No Test Cases, No Problem: Distillation-Driven Code Generation for Scientific Workflows

cs.SE · 2026-04-25 · unverdicted · novelty 6.0

MOSAIC generates executable scientific code without I/O test cases by combining student-teacher distillation with a consolidated context window to reduce hallucinations across subproblems.

Hallucination is a Consequence of Space-Optimality: A Rate-Distortion Theorem for Membership Testing

cs.LG · 2026-01-31 · unverdicted · novelty 6.0

Hallucinations are the space-optimal behavior for limited-capacity models performing membership testing on sparse facts, as shown by a rate-distortion theorem that equates optimal memory use to minimum KL divergence between fact and non-fact score distributions.

MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding

cs.CL · 2025-10-09 · unverdicted · novelty 6.0

MOSAIC is a training-free multi-agent LLM framework with rationale, coding, reflection, and debugging agents plus a consolidated context window that outperforms prior methods on scientific coding benchmarks.

Hallucinations are inevitable but can be made statistically negligible

cs.CL · 2025-02-15 · unverdicted · novelty 6.0

Hallucinations are inevitable on an infinite set of inputs but can be made statistically negligible with sufficient training data quality and quantity.

Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?

cs.AI · 2025-03-23 · conditional · novelty 5.0

LLMs show accuracy drops of 0.3% to 5.9% on GSM8K math problems when culturally adapted to six countries while keeping math operations identical, with statistical significance confirmed by McNemar tests.

citing papers explorer

Showing 6 of 6 citing papers.

Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

cs.CL · 2026-05-07 · unverdicted · none · ref 53

No Test Cases, No Problem: Distillation-Driven Code Generation for Scientific Workflows

cs.SE · 2026-04-25 · unverdicted · none · ref 2

Hallucination is a Consequence of Space-Optimality: A Rate-Distortion Theorem for Membership Testing

cs.LG · 2026-01-31 · unverdicted · none · ref 1

MOSAIC: Multi-agent Orchestration for Task-Intelligent Scientific Coding

cs.CL · 2025-10-09 · unverdicted · none · ref 47

Hallucinations are inevitable but can be made statistically negligible

cs.CL · 2025-02-15 · unverdicted · none · ref 46

Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?

cs.AI · 2025-03-23 · conditional · none · ref 26
