MiniCheck : Efficient fact-checking of LLMs on grounding documents

Tang, L · 2024 · arXiv 2404.10774

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration

cs.CL · 2026-04-17 · unverdicted · novelty 7.0

RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.

Whose Story Gets Told? Positionality and Bias in LLM Summaries of Life Narratives

cs.CL · 2026-04-22 · unverdicted · novelty 6.0

A proposed pipeline shows LLMs introduce detectable race and gender biases when summarizing life narratives, creating potential for representational harm in research.

No-Worse Context-Aware Decoding: Preventing Neutral Regression in Context-Conditioned Generation

cs.CL · 2026-04-17 · unverdicted · novelty 6.0

NWCAD uses a two-stream setup with a two-stage gate to prevent accuracy drops on baseline-correct items under non-informative contexts while retaining gains from helpful contexts.

Uncertainty-Aware Web-Conditioned Scientific Fact-Checking

cs.CL · 2026-04-13 · unverdicted · novelty 5.0

An uncertainty-gated fact-checking system decomposes claims atomically, verifies them against context, and selectively searches the web only for uncertain facts, outperforming benchmarks while abstaining on conflicts.

HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs

cs.CL · 2026-05-04 · unverdicted · novelty 4.0

HalluScan benchmark evaluates hallucination detection in LLMs, reporting NLI Verification at AUROC 0.88 and introducing HalluScore (r=0.41 with humans) plus Adaptive Detection Routing for 2x cost savings.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

citing papers explorer

Showing 6 of 6 citing papers.

RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration cs.CL · 2026-04-17 · unverdicted · none · ref 21
RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
Whose Story Gets Told? Positionality and Bias in LLM Summaries of Life Narratives cs.CL · 2026-04-22 · unverdicted · none · ref 162
A proposed pipeline shows LLMs introduce detectable race and gender biases when summarizing life narratives, creating potential for representational harm in research.
No-Worse Context-Aware Decoding: Preventing Neutral Regression in Context-Conditioned Generation cs.CL · 2026-04-17 · unverdicted · none · ref 19
NWCAD uses a two-stream setup with a two-stage gate to prevent accuracy drops on baseline-correct items under non-informative contexts while retaining gains from helpful contexts.
Uncertainty-Aware Web-Conditioned Scientific Fact-Checking cs.CL · 2026-04-13 · unverdicted · none · ref 16
An uncertainty-gated fact-checking system decomposes claims atomically, verifies them against context, and selectively searches the web only for uncertain facts, outperforming benchmarks while abstaining on conflicts.
HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs cs.CL · 2026-05-04 · unverdicted · none · ref 44
HalluScan benchmark evaluates hallucination detection in LLMs, reporting NLI Verification at AUROC 0.88 and introducing HalluScore (r=0.41 with humans) plus Adaptive Detection Routing for 2x cost savings.
A Survey of Scaling in Large Language Model Reasoning cs.AI · 2025-04-02 · unverdicted · none · ref 195
A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

MiniCheck : Efficient fact-checking of LLMs on grounding documents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer