Discounted Cumulated Gain Based Evaluation of Multiple-Query IR Sessions

Kalervo Järvelin, Jaana Kekäläinen · 2002 · arXiv 2415.582418

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

cs.LG · 2026-05-02 · unverdicted · novelty 7.0

EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.

TubiFM: Unified Item, Carousel, and Search Ranking for Streaming Discovery

cs.IR · 2026-05-22 · unverdicted · novelty 6.0

A Llama-based model trained on serialized user stories unifies item, carousel, and search ranking and outperforms specialist baselines offline while improving some online metrics and reducing latency.

Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix

cs.IR · 2026-05-18 · unverdicted · novelty 6.0

A q-log odds variant of BM25 raises NDCG@10 by 89% relative on CodeSearchNet Go under fixed generic tokenization while recovering standard BM25 at q=1.

MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

cs.IR · 2026-05-11 · unverdicted · novelty 6.0

MIRA is a new benchmark for multi-category integrated retrieval built from real queries on a social science platform, with LLM assistance for topic descriptions and relevance labeling across four item categories.

JU'A -- A Benchmark for Information Retrieval in Brazilian Legal Text Collections

cs.IR · 2026-04-07 · accept · novelty 6.0

JU'A is a new heterogeneous benchmark for Brazilian legal IR that distinguishes retrieval methods and shows domain-adapted models excel on aligned subsets while BM25 stays competitive elsewhere.

Predicting New Concept-Object Associations in Astronomy by Mining the Literature

astro-ph.IM · 2026-02-15 · unverdicted · novelty 6.0

Matrix factorization on a literature-mined concept-object graph predicts future associations in astronomy better than neighborhood similarity or recency heuristics.

ECPO: Evidence-Coupled Policy Optimization for Evidence-Certified Candidate Ranking

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

ECPO is a listwise policy optimization method that couples ranking utility with span-level evidence certificate validity and a deterministic verifier reward on MAVEN-ERE and RAMS datasets.

Unsupervised Domain Shift Detection with Interpretable Subspace Attribution

stat.ML · 2026-05-15 · unverdicted · novelty 5.0

An unsupervised method detects domain shifts via localized density anomaly search in feature space, attributes the shift to a minimal subspace, and extracts balanced subsets from two unlabeled datasets.

Governing AI-Assisted Security Operations: A Design Science Framework for Operational Decision Support

cs.CR · 2026-05-10 · unverdicted · novelty 5.0

The paper develops a design science framework for governing AI-assisted operational decision support in security operations centers by specifying a query-broker artifact that separates AI planning from execution through approved templates, policy validation, and engineering review gates.

Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

cs.IR · 2026-03-09 · unverdicted · novelty 5.0

Coverage-focused retrieval metrics correlate strongly with nugget coverage in RAG responses across text and multimodal benchmarks, supporting their use as performance proxies when retrieval and generation goals align.

User Simulation for Evaluating Information Access Systems

cs.HC · 2023-06-14 · unverdicted · novelty 2.0

A systematic review of user simulation frameworks, models, and applications for evaluating information access systems.

IDRBench: Understanding the Capability of Large Language Models on Interdisciplinary Research

cs.CL · 2025-07-21

citing papers explorer

Showing 12 of 12 citing papers.

Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks

cs.LG · 2026-05-02 · unverdicted · none · ref 86

EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.

TubiFM: Unified Item, Carousel, and Search Ranking for Streaming Discovery

cs.IR · 2026-05-22 · unverdicted · none · ref 10

A Llama-based model trained on serialized user stories unifies item, carousel, and search ranking and outperforms specialist baselines offline while improving some online metrics and reducing latency.

Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix

cs.IR · 2026-05-18 · unverdicted · none · ref 7

A q-log odds variant of BM25 raises NDCG@10 by 89% relative on CodeSearchNet Go under fixed generic tokenization while recovering standard BM25 at q=1.

MIRA: An LLM-Assisted Benchmark for Multi-Category Integrated Retrieval

cs.IR · 2026-05-11 · unverdicted · none · ref 30

MIRA is a new benchmark for multi-category integrated retrieval built from real queries on a social science platform, with LLM assistance for topic descriptions and relevance labeling across four item categories.

JU'A -- A Benchmark for Information Retrieval in Brazilian Legal Text Collections

cs.IR · 2026-04-07 · accept · none · ref 25

JU'A is a new heterogeneous benchmark for Brazilian legal IR that distinguishes retrieval methods and shows domain-adapted models excel on aligned subsets while BM25 stays competitive elsewhere.

Predicting New Concept-Object Associations in Astronomy by Mining the Literature

astro-ph.IM · 2026-02-15 · unverdicted · none · ref 12

Matrix factorization on a literature-mined concept-object graph predicts future associations in astronomy better than neighborhood similarity or recency heuristics.

ECPO: Evidence-Coupled Policy Optimization for Evidence-Certified Candidate Ranking

cs.AI · 2026-05-21 · unverdicted · none · ref 7

ECPO is a listwise policy optimization method that couples ranking utility with span-level evidence certificate validity and a deterministic verifier reward on MAVEN-ERE and RAMS datasets.

Unsupervised Domain Shift Detection with Interpretable Subspace Attribution

stat.ML · 2026-05-15 · unverdicted · none · ref 15

An unsupervised method detects domain shifts via localized density anomaly search in feature space, attributes the shift to a minimal subspace, and extracts balanced subsets from two unlabeled datasets.

Governing AI-Assisted Security Operations: A Design Science Framework for Operational Decision Support

cs.CR · 2026-05-10 · unverdicted · none · ref 36

The paper develops a design science framework for governing AI-assisted operational decision support in security operations centers by specifying a query-broker artifact that separates AI planning from execution through approved templates, policy validation, and engineering review gates.

Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

cs.IR · 2026-03-09 · unverdicted · none · ref 26

Coverage-focused retrieval metrics correlate strongly with nugget coverage in RAG responses across text and multimodal benchmarks, supporting their use as performance proxies when retrieval and generation goals align.

User Simulation for Evaluating Information Access Systems

cs.HC · 2023-06-14 · unverdicted · none · ref 7

A systematic review of user simulation frameworks, models, and applications for evaluating information access systems.

IDRBench: Understanding the Capability of Large Language Models on Interdisciplinary Research

cs.CL · 2025-07-21 · unreviewed · ref 14