pith. machine review for the scientific record.

arxiv: 2504.01990 · v2 · submitted 2025-03-31 · 💻 cs.AI

Recognition: unknown

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Authors on Pith: no claims yet
classification 💻 cs.AI
keywords agents, challenges, intelligence, intelligent, modular, research, systems, architectures
0 comments
original abstract

The advent of large language models (LLMs) has catalyzed a transformative shift in artificial intelligence, paving the way for advanced intelligent agents capable of sophisticated reasoning, robust perception, and versatile action across diverse domains. As these agents increasingly drive AI research and practical applications, their design, evaluation, and continuous improvement present intricate, multifaceted challenges. This book provides a comprehensive overview, framing intelligent agents within modular, brain-inspired architectures that integrate principles from cognitive science, neuroscience, and computational research. We structure our exploration into four interconnected parts. First, we systematically investigate the modular foundation of intelligent agents, mapping their cognitive, perceptual, and operational modules onto analogous human brain functionalities and elucidating core components such as memory, world modeling, reward processing, goals, and emotion. Second, we discuss self-enhancement and adaptive evolution mechanisms, exploring how agents autonomously refine their capabilities, adapt to dynamic environments, and achieve continual learning through automated optimization paradigms. Third, we examine multi-agent systems, investigating the collective intelligence emerging from agent interactions, cooperation, and societal structures. Finally, we address the critical imperative of building safe and beneficial AI systems, emphasizing intrinsic and extrinsic security threats, ethical alignment, robustness, and practical mitigation strategies necessary for trustworthy real-world deployment. By synthesizing modular AI architectures with insights from different disciplines, this survey identifies key research challenges and opportunities, encouraging innovations that harmonize technological advancement with meaningful societal benefit.
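The abstract's first part describes a modular decomposition of an agent into memory, world-modeling, reward-processing, and goal components. As a rough illustration of what such a decomposition can look like in code, here is a minimal toy sketch; all class and method names are hypothetical and not taken from the paper, and the transition and reward functions are deliberately trivial stand-ins.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Episodic memory module: stores (observation, action, reward) tuples."""
    episodes: list = field(default_factory=list)

    def store(self, observation, action, reward):
        self.episodes.append((observation, action, reward))

    def recall(self, n=3):
        return self.episodes[-n:]  # most recent experience

class WorldModel:
    """World-modeling module: predicts the next state for a candidate action.
    Here the dynamics are a toy stand-in (state + action)."""
    def predict(self, observation, action):
        return observation + action

class Agent:
    """Glues the modules together: goal, memory, world model, reward, action."""
    def __init__(self, goal):
        self.goal = goal              # goal module: the target state
        self.memory = Memory()
        self.world_model = WorldModel()

    def reward(self, observation):
        # reward-processing module: negative distance to the goal
        return -abs(self.goal - observation)

    def act(self, observation, actions=(-1, 0, 1)):
        # choose the action whose predicted outcome the reward module prefers
        return max(actions,
                   key=lambda a: self.reward(self.world_model.predict(observation, a)))

    def step(self, observation):
        action = self.act(observation)
        nxt = self.world_model.predict(observation, action)
        self.memory.store(observation, action, self.reward(nxt))
        return nxt

agent = Agent(goal=5)
state = 0
for _ in range(10):
    state = agent.step(state)
print(state)                        # → 5 (the agent reaches its goal state)
print(len(agent.memory.recall()))  # → 3 (most recent episodes)
```

The point of the sketch is only the separation of concerns: each capability the abstract names lives behind its own interface, so any module (e.g. the toy world model) can be swapped out without touching the others.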

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Harnessing Agentic Evolution

    cs.AI 2026-05 unverdicted novelty 7.0

    AEvo introduces a meta-agent that edits the evolution procedure or agent context based on accumulated state, outperforming baselines by 26% relative improvement on agentic benchmarks and achieving SOTA on open-ended tasks.

  2. AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

    cs.CL 2026-05 unverdicted novelty 7.0

    AgentForesight introduces an online auditor model that predicts decisive errors in multi-agent trajectories at the earliest step using a coarse-to-fine reinforcement learning recipe on a new curated dataset AFTraj-2K.

  3. AgentForesight: Online Auditing for Early Failure Prediction in Multi-Agent Systems

    cs.CL 2026-05 unverdicted novelty 7.0

    AgentForesight trains a 7B model to perform online auditing of multi-agent LLM trajectories, detecting early decisive errors and outperforming larger models on custom and external benchmarks.

  4. Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost

    cs.AI 2026-05 conditional novelty 7.0

    Post-Reasoning boosts LLM accuracy by reversing the usual answer-after-reasoning order, delivering mean relative gains of 17.37% across 117 model-benchmark pairs with zero extra cost.

  5. ReCast: Recasting Learning Signals for Reinforcement Learning in Generative Recommendation

    cs.LG 2026-04 unverdicted novelty 7.0

    ReCast repairs all-zero groups and uses contrastive updates on strongest positives and hardest negatives to improve RL in generative recommendation, yielding up to 36.6% better Pass@1 with only 4.1% of baseline rollou...

  6. FuzzAgent: Multi-Agent System for Evolutionary Library Fuzzing

    cs.SE 2026-05 conditional novelty 6.0

    FuzzAgent deploys specialized agents that collaborate on harness generation, execution, and crash triage to evolve fuzzing campaigns, delivering 45-191% more branch coverage than four baselines on 20 C/C++ libraries a...

  7. CHAL: Council of Hierarchical Agentic Language

    cs.AI 2026-05 unverdicted novelty 6.0

    CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.

  8. MemORAI: Memory Organization and Retrieval via Adaptive Graph Intelligence for LLM Conversational Agents

    cs.CL 2026-05 unverdicted novelty 6.0

    MemORAI combines selective filtering, provenance tracking in multi-relational graphs, and dynamic weighted PageRank retrieval to achieve state-of-the-art memory retrieval and personalized responses in LLM agents on LO...

  9. Learning to Communicate: Toward End-to-End Optimization of Multi-Agent Language Systems

    cs.AI 2026-04 unverdicted novelty 6.0

    DiffMAS jointly optimizes latent communication and reasoning in multi-agent LLM systems via parameter-efficient supervised training on trajectories, yielding consistent gains over baselines on math, science, and code ...

  10. Do LLMs Need to See Everything? A Benchmark and Study of Failures in LLM-driven Smartphone Automation using Screentext vs. Screenshots

    cs.HC 2026-04 unverdicted novelty 6.0

    A new benchmark shows LLM smartphone agents achieve comparable success with screen text alone as with screenshots, but both fail often due to UI accessibility and reasoning gaps.

  11. Agent-GWO: Collaborative Agents for Dynamic Prompt Optimization in Large Language Models

    cs.NE 2026-04 unverdicted novelty 6.0

    Agent-GWO uses collaborative grey-wolf-inspired agents to jointly optimize LLM prompts and decoding settings, yielding higher accuracy and stability than prior single-agent prompt optimization methods on math and hybr...

  12. ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

    cs.CR 2026-04 unverdicted novelty 6.0

    ADAM extracts data from LLM agent memory with up to 100% attack success rate by estimating data distribution and selecting queries via entropy guidance.

  13. Memory in the Age of AI Agents

    cs.CL 2025-12 unverdicted novelty 6.0

    The paper maps agent memory research via three forms (token-level, parametric, latent), three functions (factual, experiential, working), and dynamics of formation/evolution/retrieval, plus benchmarks and future directions.

  14. Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

    cs.AI 2026-05 conditional novelty 5.0

    The survey proposes the LIFE framework to unify fragmented research on collaboration, failure attribution, and self-evolution in LLM multi-agent systems into a progression toward self-organizing intelligence.

  15. UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

    cs.AI 2025-09 conditional novelty 5.0

    UI-TARS-2 reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld while attaining 59.8 mean normalized score on a 15-game suite through multi-turn RL and scalable data generation.

  16. A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

    cs.AI 2025-08 unverdicted novelty 5.0

    A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.

  17. Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

    cs.CL 2025-03 accept novelty 5.0

    A survey organizing techniques to achieve efficient reasoning in LLMs by shortening chain-of-thought outputs.

  18. Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures

    cs.AI 2026-04 unverdicted novelty 4.0

    A survey comparing classical multi-agent systems with large foundation model-enabled multi-agent systems, showing how the latter enables semantic-level collaboration and greater adaptability.

  19. Perspective on Bias in Biomedical AI: Preventing Downstream Healthcare Disparities

    cs.AI 2026-04 conditional novelty 4.0

    Omics datasets show low ancestry reporting and strong European bias, which biomedical foundation models risk perpetuating into downstream healthcare disparities unless addressed through provenance, openness, and evalu...