CoRR abs/2411.08275 (2024)

Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Hoa Trang Dang, Jimmy Lin · 2024 · arXiv 2411.08275

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

Hybrid Pooling with LLMs via Relevance Context Learning

cs.IR · 2026-02-09 · unverdicted · novelty 7.0

Relevance Context Learning generates explicit relevance narratives from judged examples to guide LLM assessors, outperforming zero-shot and standard in-context learning for IR relevance judgments.

DoGMaTiQ: Automated Generation of Question-and-Answer Nuggets for Report Evaluation

cs.CL · 2026-05-06 · unverdicted · novelty 6.0

DoGMaTiQ automates QA-nugget creation via document-grounded generation, paraphrase clustering, and quality-based subselection, yielding strong rank correlations with human judgments on cross-lingual TREC tasks.

Formalized Information Needs Improve Large-Language-Model Relevance Judgments

cs.IR · 2026-04-05 · conditional · novelty 6.0

Synthetically formalizing information needs into topics with descriptions and narratives improves LLM relevance assessor agreement with humans and reduces over-labeling of relevant documents on TREC Deep Learning and Robust04.

When LLM Judges Inflate Scores: Exploring Overrating in Relevance Assessment

cs.IR · 2026-02-19 · unverdicted · novelty 6.0

LLMs consistently overrate relevance of inadequate passages in IR evaluations due to biases toward length and lexical features rather than true content match.

citing papers explorer

Hybrid Pooling with LLMs via Relevance Context Learning · cs.IR · 2026-02-09 · unverdicted · none · ref 41
DoGMaTiQ: Automated Generation of Question-and-Answer Nuggets for Report Evaluation · cs.CL · 2026-05-06 · unverdicted · none · ref 30
Formalized Information Needs Improve Large-Language-Model Relevance Judgments · cs.IR · 2026-04-05 · conditional · none · ref 41
When LLM Judges Inflate Scores: Exploring Overrating in Relevance Assessment · cs.IR · 2026-02-19 · unverdicted · none · ref 26
