HealthAgentBench is a new benchmark of 54 healthcare agent tasks where even the strongest frontier AI agent reaches only about 42% success rate on end-to-end clinical workflows.
Self-evolving multi-agent simulations for realistic clinical interactions.arXiv preprint arXiv:2503.22678
4 Pith papers cite this work. Polarity classification is still indexing.
4
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
The paper maps agent memory research via three forms (token-level, parametric, latent), three functions (factual, experiential, working), and dynamics of formation/evolution/retrieval, plus benchmarks and future directions.
A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
citing papers explorer
No citing papers match the current filters.