InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (3)
Roles: background (2)
Polarities: background (2)
Representative citing papers
LLMs exhibit mid-layer representation advantage for recommendations; MARC compresses representations modularly to reduce costs while improving performance, as shown in a large-scale online advertising deployment.
Agentic RAG for Ukrainian improves answer accuracy via retries but is still limited by document and page retrieval quality.
citing papers explorer
- AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
  AutoResearchBench is a new benchmark showing top AI agents achieve under 10% success on complex scientific literature discovery tasks that demand deep comprehension and open-ended search.
- Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations
  LLMs exhibit mid-layer representation advantage for recommendations; MARC compresses representations modularly to reduce costs while improving performance, as shown in a large-scale online advertising deployment.
- Toward Agentic RAG for Ukrainian
  Agentic RAG for Ukrainian improves answer accuracy via retries but is still limited by document and page retrieval quality.