arXiv preprint arXiv:2502.17125
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration
  RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
- Detecting Hallucinations for Large Language Model-based Knowledge Graph Reasoning
  LUCID detects hallucinations in LLM-KG reasoning by extracting node/edge features from attention and semantics then integrating them with KG structure in a GNN, achieving SOTA on nine new benchmark datasets versus 15 baselines.
- Do Benchmarks Underestimate LLM Performance? Evaluating Hallucination Detection With LLM-First Human-Adjudicated Assessment
  Human adjudication of conflicts between original benchmark labels and LLM predictions on QAGS-C and SummEval increases triple agreement by 6-8% and LLM accuracy by 2-9%, with adjudicators often siding with models that provide explicit reasoning.