InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation

2025 · arXiv 2505.15872

5 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery

cs.AI · 2026-04-28 · accept · novelty 8.0

AutoResearchBench is a new benchmark showing top AI agents achieve under 10% success on complex scientific literature discovery tasks that demand deep comprehension and open-ended search.

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

cs.CL · 2026-06-10 · unverdicted · novelty 6.0

Arbor combines a coordinator, executors, and a hypothesis tree to enable cumulative autonomous research, outperforming Codex and Claude Code by over 2.5x on six real tasks and reaching 86.36% Any Medal on MLE-Bench Lite.

Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations

cs.IR · 2026-04-20 · unverdicted · novelty 6.0

LLMs exhibit mid-layer representation advantage for recommendations; MARC compresses representations modularly to reduce costs while improving performance, as shown in a large-scale online advertising deployment.

Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application

cs.CL · 2026-06-10 · unverdicted · novelty 5.0

This survey categorizes agentic environments for LLMs by eight attributes and domains, introduces symbolic and neural synthesis paradigms with evaluation, and outlines four agent evolution pathways plus three environment evolution paradigms.

Toward Agentic RAG for Ukrainian

cs.AI · 2026-04-16 · unverdicted · novelty 3.0

Agentic RAG for Ukrainian improves answer accuracy via retries but is still limited by document and page retrieval quality.

citing papers explorer

Showing 1 of 1 citing paper after filters.

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery

cs.AI · 2026-04-28 · accept · none · ref 48

AutoResearchBench is a new benchmark showing top AI agents achieve under 10% success on complex scientific literature discovery tasks that demand deep comprehension and open-ended search.
