The MedHal-Loc benchmark shows that KG-triple hallucination detectors localize errors no better than chance on controlled medical statements because of entity-extraction limits, whereas NLI and consistency methods perform above chance; real hallucinations are mostly diffuse changes to conclusions.
arXiv preprint arXiv:2401.06855 (2024)
9 Pith papers cite this work.
representative citing papers
K-FinHallu is the first multi-turn Korean financial RAG hallucination benchmark; frontier LLMs struggle especially on justified abstention while an 8B fine-tuned model reaches competitive performance.
LDKE framework localizes fact-specific layers and disentangles inputs to improve generalization and locality in multimodal knowledge editing for MLLMs.
Token entropy distributions fingerprint hallucinations in generative models, enabling the Calibrated Entropy Score (CES) for single-pass black-box detection with calibration guarantees via a novel DKW inequality.
The ReFACT benchmark reveals a persistent salient-distractor failure mode in LLMs: 61% of incorrect error-span predictions are semantically unrelated to the actual errors, the pattern persists across model sizes, and comparative judgment yields lower F1 than independent detection.
MultiHaluDet uses multi-layer hidden-state probing, multi-scale attention, and a calibrated classifier ensemble to detect multilingual hallucinations, reporting up to 98.55% AUROC on English benchmarks and strong cross-lingual transfer to French, Bangla, and Amharic.
Researchers create a human-labeled dataset of obvious and elusive multimodal hallucinations and use learned activation-space probes to control their verifiability in MLLMs.
Fine-tuned compact models achieve strong multilingual performance and large efficiency gains over LLMs on production data from 114 languages for claim detection and 28 for veracity prediction.
HalluScan benchmark evaluates hallucination detection in LLMs, reporting NLI Verification at AUROC 0.88 and introducing HalluScore (r=0.41 with humans) plus Adaptive Detection Routing for 2x cost savings.
citing papers explorer
K-FinHallu: A Hallucination Detection Benchmark for Multi-Turn RAG in Korean Finance