arxiv: 2605.11853 · 2 revisions
GEAR: Granularity-Adaptive Advantage Reweighting for LLM Agents via Self-Distillation