MARS: Modular Agent with Reflective Search for Automated AI Research

Jiefeng Chen, Bhavana Dalvi Mishra, Jaehyun Nam, Rui Meng, Tomas Pfister, Jinsung Yoon · 2026 · arXiv 2602.02660

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DrugSAGE: Self-evolving Agent Experience for Efficient State-of-the-Art Drug Discovery

cs.LG · 2026-05-14 · unverdicted · novelty 6.0

DrugSAGE accumulates cross-task memory of skills, statistical evidence, and recurring errors to let LLM agents achieve top-ranked performance on molecular property prediction tasks with reduced or zero test-time search.

Revisiting DAgger in the Era of LLM-Agents

cs.LG · 2026-05-13 · conditional · novelty 6.0

DAgger-style training with turn-level policy interpolation raises 4B and 8B LLM agents to 27.3% and 29.8% on SWE-bench Verified, beating several larger published systems.

AIBuildAI: An AI Agent for Automatically Building AI Models

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

AIBuildAI uses a manager agent and three LLM sub-agents to fully automate AI model development and achieves a 63.1% medal rate on MLE-Bench, matching experienced human engineers.

Agentic Discovery with Active Hypothesis Exploration for Visual Recognition

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

HypoExplore uses LLMs for hypothesis-driven evolutionary search with a Trajectory Tree and Hypothesis Memory Bank to discover lightweight vision architectures, reaching 94.11% accuracy on CIFAR-10 from an 18.91% baseline and generalizing to other datasets, including state-of-the-art results on MedMNIST.

AIRA₂: Overcoming Bottlenecks in AI Research Agents

cs.AI · 2026-03-27 · conditional · novelty 6.0

AIRA₂ improves AI research agents via asynchronous multi-GPU workers, hidden consistent evaluation, and interactive ReAct agents, reaching a percentile rank of 81.5–83.1% on MLE-bench-30 and exceeding human SOTA on 6 of 20 AIRS-Bench tasks.

FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics

cs.LG · 2026-05-17

Toward Autonomous Long-Horizon Engineering for ML Research

cs.CL · 2026-04-14

citing papers explorer

DrugSAGE: Self-evolving Agent Experience for Efficient State-of-the-Art Drug Discovery cs.LG · 2026-05-14 · unverdicted · none · ref 15 · internal anchor
Revisiting DAgger in the Era of LLM-Agents cs.LG · 2026-05-13 · conditional · none · ref 7 · internal anchor
AIBuildAI: An AI Agent for Automatically Building AI Models cs.AI · 2026-04-15 · unverdicted · none · ref 25 · internal anchor
Agentic Discovery with Active Hypothesis Exploration for Visual Recognition cs.CV · 2026-04-14 · unverdicted · none · ref 8 · internal anchor
AIRA₂: Overcoming Bottlenecks in AI Research Agents cs.AI · 2026-03-27 · conditional · none · ref 3 · internal anchor
FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics cs.LG · 2026-05-17 · unreviewed · ref 10 · internal anchor
Toward Autonomous Long-Horizon Engineering for ML Research cs.CL · 2026-04-14 · unreviewed · ref 3 · internal anchor
