arxiv: 2604.10547 · 2 revisions
Agent^2 RL-Bench: Can LLM Agents Engineer Agentic RL Post-Training?