Auto-ARGUE: LLM-Based Report Generation Evaluation
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
Generation of citation-backed reports is a primary use case for retrieval-augmented generation (RAG) systems. While open-source evaluation tools exist for various RAG tasks, tools designed for report generation are lacking. Accordingly, we introduce Auto-ARGUE, a robust LLM-based implementation of the recently proposed ARGUE framework for report generation evaluation. We present analysis of Auto-ARGUE on the report generation pilot task from the TREC 2024 NeuCLIR track and on two tasks from the TREC 2024 RAG track, showing good system-level correlations with human judgments. Additionally, we release ARGUE-Viz, a web app for visualization and fine-grained analysis of Auto-ARGUE judgments and scores.
citation-role summary
citation-polarity summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
BloomBee is a distributed LLM inference system that achieves up to 1.76x higher throughput and 43.2% lower latency than prior decentralized systems by optimizing communication across multiple dimensions in low-bandwidth internet settings.
Coverage-focused retrieval metrics correlate strongly with nugget coverage in RAG responses across text and multimodal benchmarks, supporting their use as performance proxies when retrieval and generation goals align.
citing papers explorer
- DoGMaTiQ: Automated Generation of Question-and-Answer Nuggets for Report Evaluation
  DoGMaTiQ automates QA-nugget creation via document-grounded generation, paraphrase clustering, and quality-based subselection, yielding strong rank correlations with human judgments on cross-lingual TREC tasks.
- Distributed Generative Inference of LLM at Internet Scales with Multi-Dimensional Communication Optimization
  BloomBee is a distributed LLM inference system that achieves up to 1.76x higher throughput and 43.2% lower latency than prior decentralized systems by optimizing communication across multiple dimensions in low-bandwidth internet settings.
- Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage
  Coverage-focused retrieval metrics correlate strongly with nugget coverage in RAG responses across text and multimodal benchmarks, supporting their use as performance proxies when retrieval and generation goals align.