Ensemble of llm-retriever for accurate document ranking

Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, et al · 2024 · arXiv 2501.00332

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception

cs.CL · 2026-03-23 · unverdicted · novelty 7.0

The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.

$S^3$-R1: Learning to Retrieve and Answer Step-by-Step with Synthetic Data

cs.LG · 2026-05-02 · unverdicted · novelty 6.0

$S^3$-R1 generates synthetic intermediate-difficulty multi-hop questions and applies dense rewards for search quality plus answer correctness, yielding up to 10% better out-of-domain generalization than baselines.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

cs.CL · 2025-11-25 · unverdicted · novelty 6.0

Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.

MARS-SQL: A multi-agent reinforcement learning framework for Text-to-SQL

cs.CL · 2025-11-02 · unverdicted · novelty 5.0

MARS-SQL trains a multi-agent RL system with ReAct-style interaction and generative validation to produce SQL queries, reaching 77.84% execution accuracy on BIRD dev and 89.75% on Spider test.

citing papers explorer

Showing 4 of 4 citing papers.

XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception

cs.CL · 2026-03-23 · unverdicted · none · ref 10

The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.

$S^3$-R1: Learning to Retrieve and Answer Step-by-Step with Synthetic Data

cs.LG · 2026-05-02 · unverdicted · none · ref 13

$S^3$-R1 generates synthetic intermediate-difficulty multi-hop questions and applies dense rewards for search quality plus answer correctness, yielding up to 10% better out-of-domain generalization than baselines.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

cs.CL · 2025-11-25 · unverdicted · none · ref 197

Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.

MARS-SQL: A multi-agent reinforcement learning framework for Text-to-SQL

cs.CL · 2025-11-02 · unverdicted · none · ref 1

MARS-SQL trains a multi-agent RL system with ReAct-style interaction and generative validation to produce SQL queries, reaching 77.84% execution accuracy on BIRD dev and 89.75% on Spider test.
