EinsteinArena is a platform for AI agents to collectively discover new mathematical results through open interaction, achieving 12 new state-of-the-art outcomes including raising the 11-dimensional kissing number lower bound from 593 to 604.
arXiv preprint arXiv:2502.04780
6 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
Citation polarities: background (2)
Representative citing papers
MAGE uses a four-subgraph co-evolutionary knowledge graph plus dual bandits to externalize and retrieve experience for stable self-evolution of frozen language-model agents, showing gains on nine diverse benchmarks.
SAGE trains agents in physics-grounded semantic abstractions via RL with asymmetric clipping, achieving 53.21% LLM-Match Success on A-EQA (+9.7% over baseline) and showing encouraging transfer to physical robots.
SimpleTES scales test-time evaluation in LLMs to discover state-of-the-art solutions on 21 scientific problems across six domains, outperforming frontier models and optimization pipelines, with examples including a 2x faster LASSO and new Erdős constructions.
CAF-Gen uses an iterative multi-agent creator-reviewer process to enrich shallow argument mining outputs into structurally richer CAF-compliant models with claimed improvements over single-pass generation.
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.