arXiv preprint arXiv:2512.20949

Neural probe-based hallucination detection for large language models · arXiv 2512.20949

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts

cs.CL · 2026-05-16 · unverdicted · novelty 6.0

Benchmark construction artifacts in hallucination detection corpora allow naive text-similarity baselines to achieve near-perfect scores, and controlled evaluations show most methods perform near chance except SAPLMA and the new DRIFT probe.

Sparse Reward Subsystem in Large Language Models

cs.CL · 2026-02-01 · unverdicted · novelty 6.0

LLM hidden states contain a sparse reward subsystem consisting of value neurons that predict state value and dopamine neurons that encode step-level temporal difference errors.

MultiHaluDet: Multilingual Hallucination Detection via LLM Hidden State Probing

cs.CL · 2026-05-24 · unverdicted · novelty 5.0

MultiHaluDet uses multi-layer hidden-state probing, multi-scale attention, and a calibrated classifier ensemble to detect multilingual hallucinations, reporting up to 98.55% AUROC on English benchmarks and strong cross-lingual transfer to French, Bangla, and Amharic.

citing papers explorer

Showing 3 of 3 citing papers.

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts · cs.CL · 2026-05-16 · unverdicted · none · ref 56
Sparse Reward Subsystem in Large Language Models · cs.CL · 2026-02-01 · unverdicted · none · ref 16
MultiHaluDet: Multilingual Hallucination Detection via LLM Hidden State Probing · cs.CL · 2026-05-24 · unverdicted · none · ref 13
