LiteGUI trains 2B/3B-scale GUI agents via SFT-free guided on-policy distillation and multi-solution dual-level GRPO to reach SOTA lightweight performance and compete with larger models.
Advances in Neural Information Processing Systems , volume=
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.AI 3representative citing papers
Longer action horizons bottleneck LLM agent training through instability, but training with reduced horizons stabilizes learning and enables better generalization to longer horizons.
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.
citing papers explorer
-
LiteGUI: Distilling Compact GUI Agents with Reinforcement Learning
LiteGUI trains 2B/3B-scale GUI agents via SFT-free guided on-policy distillation and multi-solution dual-level GRPO to reach SOTA lightweight performance and compete with larger models.
-
On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length
Longer action horizons bottleneck LLM agent training through instability, but training with reduced horizons stabilizes learning and enables better generalization to longer horizons.
-
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence
The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.