arXiv preprint arXiv:2401.06855 (2024)

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MedHal-Loc: Are "Explainable-by-Architecture" Medical Hallucination Detectors Faithful Localizers? A Localization Benchmark

cs.CL · 2026-06-19 · unverdicted · novelty 7.0

The MedHal-Loc benchmark shows that KG-triple hallucination detectors localize errors no better than chance on controlled medical statements because of entity-extraction limits, whereas NLI and consistency methods succeed above chance; real hallucinations are mostly diffuse conclusion changes.

K-FinHallu: A Hallucination Detection Benchmark for Multi-Turn RAG in Korean Finance

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

K-FinHallu is the first multi-turn Korean financial RAG hallucination benchmark; frontier LLMs struggle especially on justified abstention while an 8B fine-tuned model reaches competitive performance.

Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

LDKE framework localizes fact-specific layers and disentangles inputs to improve generalization and locality in multimodal knowledge editing for MLLMs.

Entropy Distribution as a Fingerprint for Hallucinations in Generative Models

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

Token entropy distributions fingerprint hallucinations in generative models, enabling the Calibrated Entropy Score (CES) for single-pass black-box detection with calibration guarantees via a novel DKW inequality.

ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations

cs.CL · 2025-09-30 · conditional · novelty 6.0

The ReFACT benchmark reveals a persistent salient-distractor failure mode in LLMs: 61% of incorrect error-span predictions are semantically unrelated to the actual errors. The failure persists across model sizes, and comparative judgment yields lower F1 than independent detection.

MultiHaluDet: Multilingual Hallucination Detection via LLM Hidden State Probing

cs.CL · 2026-05-24 · unverdicted · novelty 5.0

MultiHaluDet uses multi-layer hidden-state probing, multi-scale attention, and a calibrated classifier ensemble to detect multilingual hallucinations, reporting up to 98.55% AUROC on English benchmarks and strong cross-lingual transfer to French, Bangla, and Amharic.

Steering the Verifiability of Multimodal AI Hallucinations

cs.AI · 2026-04-08 · unverdicted · novelty 5.0

Researchers create a human-labeled dataset of obvious and elusive multimodal hallucinations and use learned activation-space probes to control their verifiability in MLLMs.

Multilingual Fact-Checking at Scale: Fine-Tuned Compact Models vs LLMs

cs.CL · 2026-06-07 · unverdicted · novelty 4.0

Fine-tuned compact models achieve strong multilingual performance and large efficiency gains over LLMs on production data from 114 languages for claim detection and 28 for veracity prediction.

HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs

cs.CL · 2026-05-04 · unverdicted · novelty 4.0 · 2 refs

HalluScan benchmark evaluates hallucination detection in LLMs, reporting NLI Verification at AUROC 0.88 and introducing HalluScore (r=0.41 with humans) plus Adaptive Detection Routing for 2x cost savings.

citing papers explorer

Showing 9 of 9 citing papers.

MedHal-Loc: Are "Explainable-by-Architecture" Medical Hallucination Detectors Faithful Localizers? A Localization Benchmark cs.CL · 2026-06-19 · unverdicted · none · ref 7
K-FinHallu: A Hallucination Detection Benchmark for Multi-Turn RAG in Korean Finance cs.LG · 2026-05-28 · unverdicted · none · ref 21
Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models cs.CL · 2026-05-28 · unverdicted · none · ref 4
Entropy Distribution as a Fingerprint for Hallucinations in Generative Models cs.AI · 2026-05-27 · unverdicted · none · ref 29
ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations cs.CL · 2025-09-30 · conditional · none · ref 24
MultiHaluDet: Multilingual Hallucination Detection via LLM Hidden State Probing cs.CL · 2026-05-24 · unverdicted · none · ref 14
Steering the Verifiability of Multimodal AI Hallucinations cs.AI · 2026-04-08 · unverdicted · none · ref 24
Multilingual Fact-Checking at Scale: Fine-Tuned Compact Models vs LLMs cs.CL · 2026-06-07 · unverdicted · none · ref 19
HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs cs.CL · 2026-05-04 · unverdicted · none · ref 42 · 2 links
