hub

Agentsquare: Automatic llm agent search in modular design space

Agentsquare: Automatic llm agent search in modular design space , author= · 2024 · arXiv 2410.06153

10 Pith papers cite this work. Polarity classification is still indexing.

10 Pith papers citing it

read on arXiv browse 10 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents

cs.AI · 2025-12-14 · accept · novelty 8.0

MobiBench is the first modular multi-path offline benchmark for mobile GUI agents, achieving 94.72% agreement with human evaluators while allowing component-level analysis.

TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems

cs.CL · 2026-05-10 · unverdicted · novelty 7.0

TacoMAS performs test-time co-evolution of agent capabilities and communication topology in LLM multi-agent systems via fast capability updates and slow meta-LLM topology edits, delivering 13.3% average gains over strong baselines on four benchmarks.

Tools as Continuous Flow for Evolving Agentic Reasoning

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

FlowAgent models tool chaining as continuous latent trajectory generation with conditional flow matching to deliver global planning, formal utility bounds, and better robustness on long-horizon tasks, plus a new plan-level benchmark.

AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

AgentCo-op retrieves and assembles existing agents and tools into interoperable workflows for open-world scientific tasks, showing effectiveness in genomics case studies and competitive benchmark results with lower costs.

Harnesses for Inference-Time Alignment over Execution Trajectories

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

Partial harnesses for LLM agents, specifying only initial execution steps, achieve higher pass rates than fully decomposed workflows, as analyzed through trajectory alignment and validated in synthetic and terminal benchmarks.

Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

LLM agents trained with a task-success reward on self-generated knowledge can spontaneously explore and adapt to new environments without any rewards or instructions at inference, yielding 20% gains on web tasks and allowing a 14B model to beat Gemini-2.5-Flash.

AgentGA: Evolving Code Solutions in Agent-Seed Space

cs.AI · 2026-04-16 · unverdicted · novelty 6.0 · 2 refs

AgentGA optimizes agent seeds with genetic algorithms and parent-archive inheritance to improve autonomous code generation, beating a baseline on 15 of 16 Kaggle competitions.

Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models

cs.CL · 2025-10-09 · unverdicted · novelty 6.0

GTD generates task-adaptive, sparse communication topologies for multi-LLM agents via guided iterative graph diffusion steered by a proxy model predicting accuracy, utility, and cost.

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

cs.AI · 2025-01-16 · unverdicted · novelty 3.0

The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.

AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery

cs.CL · 2026-04-07

citing papers explorer

Showing 10 of 10 citing papers.

MobiBench: Multi-Branch, Modular Benchmark for Mobile GUI Agents cs.AI · 2025-12-14 · accept · none · ref 26
MobiBench is the first modular multi-path offline benchmark for mobile GUI agents, achieving 94.72% agreement with human evaluators while allowing component-level analysis.
TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems cs.CL · 2026-05-10 · unverdicted · none · ref 8
TacoMAS performs test-time co-evolution of agent capabilities and communication topology in LLM multi-agent systems via fast capability updates and slow meta-LLM topology edits, delivering 13.3% average gains over strong baselines on four benchmarks.
Tools as Continuous Flow for Evolving Agentic Reasoning cs.AI · 2026-05-08 · unverdicted · none · ref 36
FlowAgent models tool chaining as continuous latent trajectory generation with conditional flow matching to deliver global planning, formal utility bounds, and better robustness on long-horizon tasks, plus a new plan-level benchmark.
AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows cs.AI · 2026-05-19 · unverdicted · none · ref 19
AgentCo-op retrieves and assembles existing agents and tools into interoperable workflows for open-world scientific tasks, showing effectiveness in genomics case studies and competitive benchmark results with lower costs.
Harnesses for Inference-Time Alignment over Execution Trajectories cs.LG · 2026-05-15 · unverdicted · none · ref 36
Partial harnesses for LLM agents, specifying only initial execution steps, achieve higher pass rates than fully decomposed workflows, as analyzed through trajectory alignment and validated in synthetic and terminal benchmarks.
Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration cs.AI · 2026-04-20 · unverdicted · none · ref 11
LLM agents trained with a task-success reward on self-generated knowledge can spontaneously explore and adapt to new environments without any rewards or instructions at inference, yielding 20% gains on web tasks and allowing a 14B model to beat Gemini-2.5-Flash.
AgentGA: Evolving Code Solutions in Agent-Seed Space cs.AI · 2026-04-16 · unverdicted · none · ref 21 · 2 links
AgentGA optimizes agent seeds with genetic algorithms and parent-archive inheritance to improve autonomous code generation, beating a baseline on 15 of 16 Kaggle competitions.
Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models cs.CL · 2025-10-09 · unverdicted · none · ref 23
GTD generates task-adaptive, sparse communication topologies for multi-LLM agents via guided iterative graph diffusion steered by a proxy model predicting accuracy, utility, and cost.
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models cs.AI · 2025-01-16 · unverdicted · none · ref 127
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery cs.CL · 2026-04-07 · unreviewed · ref 16

Agentsquare: Automatic llm agent search in modular design space

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer