S., Turc, I., and Reitter, D

Measuring Attribution in Natural Language Generation Models · 2023 · DOI 10.1162/coli_a_00486

8 Pith papers cite this work.

representative citing papers

Verified Misguidance: Measuring Structural Citation Failures in Search-Augmented LLMs

cs.DL · 2026-05-27 · unverdicted · novelty 7.0

CITETRACE dataset and evaluation framework show 30.6% of citations distort sources and 27.1% use domain-inappropriate sources in search-augmented LLMs, with provider differences explaining 88-96% of quality variance.

Evaluating Commercial AI Chatbots as News Intermediaries

cs.CL · 2026-05-21 · conditional · novelty 7.0

Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.

The Warrant Gap: Claim-Conditioned Re-scoring for Fact-Checking

cs.CL · 2026-06-23 · unverdicted · novelty 6.0

Introduces claim-conditioned re-scoring (SIFT) and warranted supports proportion (WSP) metric, reporting accuracy recovery up to 27.6 points and WSP calibration at AUC 0.92 on FEVER, SciFact and other benchmarks.

Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models

cs.AI · 2025-06-21 · conditional · novelty 6.0

Active Indexing with synthetic data augmentation for bidirectional fact-source binding during pretraining yields up to 30.2% higher citation precision than passive identifier appending on CitePretrainBench for Qwen models.

Constructing Evaluation Datasets for Procedural Reasoning: Balancing Naturalness, Grounding, and Multi-Hop Coverage

cs.AI · 2026-06-11 · unverdicted · novelty 4.0

Strict generation directly from Task-Method-Knowledge models yields 96.5% grounded and 92.6% usable QA pairs across 23 topics, outperforming transcript-first and TMK-aware alternatives on representational grounding.

Explicit Evidence Grounding via Structured Inline Citation Generation

cs.CL · 2026-06-05 · unverdicted · novelty 4.0

FullCite introduces three strategies for structured inline citation generation in QA and finds LLMs identify relevant documents well but struggle with precise evidence spans on ASQA, BioASQ, and ExpertQA.

Traceable by Design: An LLM Pipeline and Dashboard for EU Regulatory Consultation Analysis

cs.CY · 2026-05-29 · unverdicted · novelty 4.0

An LLM pipeline with verbatim grounding processes 4,322 Digital Fairness Act submissions to produce 15,368 topic annotations and an interactive dashboard for traceable analysis.

Plans for Evaluating Structured Generative Search Summaries

cs.IR · 2026-05-26 · unverdicted · novelty 3.0

The authors propose an evaluation framework for LLM-generated structured search summaries and describe plans for implementing and testing it.

