Replug: Retrieval-augmented black-box language models

Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Richard James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih · 2024 · DOI 10.18653/v1/2024.naacl-long.463

8 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning

cs.AI · 2026-04-21 · unverdicted · novelty 7.0

MAGEO is a multi-agent system that distills validated editing patterns into reusable optimization skills for generative engines, outperforming heuristic baselines on visibility and fidelity via a new benchmark and evaluation protocol.

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

cs.CL · 2024-10-14 · unverdicted · novelty 7.0

LongMemEval benchmarks long-term memory in chat assistants, revealing 30% accuracy drops across sustained interactions and proposing indexing-retrieval-reading optimizations that boost performance.

Boosting Self-Consistency with Ranking

cs.CL · 2026-06-03 · unverdicted · novelty 6.0

RISC reformulates self-consistency answer selection as a ranking task solved by a lightweight LambdaRank model with five hand-designed features, yielding better accuracy-efficiency trade-offs than majority voting on QA benchmarks.

R$^3$AG: Retriever Routing for Retrieval-Augmented Generation

cs.IR · 2026-04-22 · unverdicted · novelty 6.0

R³AG routes queries to retrievers by decomposing capabilities into retrieval quality and generation utility, trained via contrastive learning on document assessments and downstream answer correctness to outperform static methods.

Procedural Knowledge at Scale Improves Reasoning

cs.CL · 2026-04-01 · unverdicted · novelty 6.0

Reasoning Memory decomposes reasoning trajectories into 32 million subquestion-subroutine pairs and retrieves them via in-thought prompts to improve language model performance on math, science, and coding benchmarks by up to 19.2%.

ARMOR: Adaptive Retriever Optimization for Low-Resource Telecom Question Answering

cs.IR · 2026-06-29 · unverdicted · novelty 5.0

ARMOR optimizes retrievers via joint RAG-likelihood and InfoNCE training with regularization toward the base encoder, yielding improved retrieval and QA on telecom benchmarks.

Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation

cs.CL · 2026-04-19 · unverdicted · novelty 5.0

QREAM rewrites documents to question-focused style using iterative ICL and distilled FT models, boosting RAG performance by up to 8% relative improvement.

Less LLM, More Documents: Searching for Improved RAG

cs.IR · 2025-10-03 · unverdicted · novelty 4.0

Corpus scaling in RAG frequently matches the accuracy gains from larger LLMs on open-domain QA tasks, with mid-sized models benefiting most due to better passage coverage.

citing papers explorer

From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning

cs.AI · 2026-04-21 · unverdicted · none · ref 79

MAGEO is a multi-agent system that distills validated editing patterns into reusable optimization skills for generative engines, outperforming heuristic baselines on visibility and fidelity via a new benchmark and evaluation protocol.

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

cs.CL · 2024-10-14 · unverdicted · none · ref 91

LongMemEval benchmarks long-term memory in chat assistants, revealing 30% accuracy drops across sustained interactions and proposing indexing-retrieval-reading optimizations that boost performance.

Boosting Self-Consistency with Ranking

cs.CL · 2026-06-03 · unverdicted · none · ref 176

RISC reformulates self-consistency answer selection as a ranking task solved by a lightweight LambdaRank model with five hand-designed features, yielding better accuracy-efficiency trade-offs than majority voting on QA benchmarks.

R$^3$AG: Retriever Routing for Retrieval-Augmented Generation

cs.IR · 2026-04-22 · unverdicted · none · ref 49

R³AG routes queries to retrievers by decomposing capabilities into retrieval quality and generation utility, trained via contrastive learning on document assessments and downstream answer correctness to outperform static methods.

Procedural Knowledge at Scale Improves Reasoning

cs.CL · 2026-04-01 · unverdicted · none · ref 34

Reasoning Memory decomposes reasoning trajectories into 32 million subquestion-subroutine pairs and retrieves them via in-thought prompts to improve language model performance on math, science, and coding benchmarks by up to 19.2%.

ARMOR: Adaptive Retriever Optimization for Low-Resource Telecom Question Answering

cs.IR · 2026-06-29 · unverdicted · none · ref 34

ARMOR optimizes retrievers via joint RAG-likelihood and InfoNCE training with regularization toward the base encoder, yielding improved retrieval and QA on telecom benchmarks.

Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation

cs.CL · 2026-04-19 · unverdicted · none · ref 134

QREAM rewrites documents to question-focused style using iterative ICL and distilled FT models, boosting RAG performance by up to 8% relative improvement.

Less LLM, More Documents: Searching for Improved RAG

cs.IR · 2025-10-03 · unverdicted · none · ref 26

Corpus scaling in RAG frequently matches the accuracy gains from larger LLMs on open-domain QA tasks, with mid-sized models benefiting most due to better passage coverage.
