Transactions of the Association for Computational Linguistics

Yannis Katsis, Sara Rosenthal, Kshitij Fadnis, Chulaka Gunasekara, Young · 2025 · DOI 10.1162/tacl.a.19

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TwinRouterBench: Fast Static and Live Dynamic Evaluation for Realistic Agentic LLM Routing

cs.LG · 2026-05-14 · accept · novelty 7.0 · 2 refs

TwinRouterBench supplies 970 execution-verified router prefixes across five datasets plus a live harness for 100 held-out SWE-bench cases, scoring routers on tier accuracy, trajectory success, and realized token cost without LLM judges.

Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

cs.CL · 2026-05-12 · unverdicted · novelty 4.0

A pipeline with LoRA-fine-tuned query rewriting, BM25+dense hybrid retrieval via RRF, and cross-encoder reranking reaches nDCG@5 of 0.531 on multi-turn retrieval across four domains.

H-RAG at SemEval-2026 Task 8: Hierarchical Parent-Child Retrieval for Multi-Turn RAG Conversations

cs.CL · 2026-05-01 · unverdicted · novelty 4.0

H-RAG uses hierarchical parent-child document segmentation with hybrid retrieval and parent-level aggregation to achieve 0.4271 nDCG@5 on retrieval and 0.3241 harmonic mean on generation in a multi-turn RAG shared task.
