thinking while doing

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2025 1

verdicts

UNVERDICTED 1

representative citing papers

Agentic Reinforced Policy Optimization

cs.LG · 2025-07-26 · unverdicted · novelty 6.0

ARPO adds entropy-based adaptive rollouts and stepwise advantage attribution to RL for LLM agents, outperforming prior trajectory-level methods on 13 benchmarks with half the tool budget.

citing papers explorer

Showing 1 of 1 citing paper.

  • Agentic Reinforced Policy Optimization cs.LG · 2025-07-26 · unverdicted · none · ref 6