Transactions of the Association for Computational Linguistics

Katsis, Yannis, Rosenthal, Sara, Fadnis, Kshitij, Gunasekara, Chulaka, Lee, Young-Suk, Popa, Lucian · 2025 · DOI 10.1162/tacl.a.19

5 Pith papers cite this work.

representative citing papers

TwinRouterBench: Fast Static and Live Dynamic Evaluation for Realistic Agentic LLM Routing

cs.LG · 2026-05-14 · accept · novelty 7.0 · 2 refs

TwinRouterBench supplies 970 execution-verified router prefixes across five datasets plus a live harness for 100 held-out SWE-bench cases, scoring routers on tier accuracy, trajectory success, and realized token cost without LLM judges.

Caraman at SemEval-2026 Task 8: Three-Stage Multi-Turn Retrieval with Query Rewriting, Hybrid Search, and Cross-Encoder Reranking

cs.CL · 2026-05-12 · unverdicted · novelty 4.0

A pipeline with LoRA-fine-tuned query rewriting, BM25+dense hybrid retrieval via RRF, and cross-encoder reranking reaches nDCG@5 of 0.531 on multi-turn retrieval across four domains.

H-RAG at SemEval-2026 Task 8: Hierarchical Parent-Child Retrieval for Multi-Turn RAG Conversations

cs.CL · 2026-05-01 · unverdicted · novelty 4.0

H-RAG uses hierarchical parent-child document segmentation with hybrid retrieval and parent-level aggregation to achieve 0.4271 nDCG@5 on retrieval and 0.3241 harmonic mean on generation in a multi-turn RAG shared task.

Sifei at SemEval-2026 Task 8: Hybrid Retrieval and Query Rewriting for Multi-Turn RAG

cs.IR · 2026-06-05 · unverdicted · novelty 3.0

A hybrid dense-sparse retrieval pipeline with query rewriting and cross-encoder reranking achieves 0.5453 nDCG@5 (third place) on Task A of SemEval-2026 Task 8 and a harmonic mean of 0.5312 on Task C.

5ting at SemEval-2026 Task 8: Strong End-to-End Multi-Turn RAG via LLM-Based Reranking and Faithfulness Control

cs.CL · 2026-06-27 · unverdicted · novelty 2.0

5ting achieves nDCG@5 of 0.4719 on Task A and a harmonic score of 0.5597 with RL_F of 0.7692 on Task C for multi-turn RAG, using standard dense retrieval plus LLM reranking and faithfulness constraints.
