arxiv: 2605.09287 · 2 revisions
PiCA: Pivot-Based Credit Assignment for Search Agentic Reinforcement Learning