RAG-HAT: A Hallucination-Aware Tuning Pipeline for LLM in Retrieval-Augmented Generation
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Beyond Document Grounding: Span-Level Hallucination Detection over Code, Tool Output, and Documents
Introduces a unified span-level hallucination detection benchmark over code, tool output, and documents; a fine-tuned Qwen3.5-2B reaches 0.689 span-F1 and outperforms baselines, including on code-agent data.