Title resolution pending

Produce a scalar reward in [−1, 1]: • exact label match → about +0

1 Pith paper cites this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Co-Evolution of Policy and Internal Reward for Language Agents

cs.LG · 2026-04-03 · unverdicted · novelty 6.0

Language agents improve long-horizon performance by generating and refining their own internal reward signals that guide actions at inference and provide denser supervision during training.

citing papers explorer

Showing 1 of 1 citing paper.

Co-Evolution of Policy and Internal Reward for Language Agents cs.LG · 2026-04-03 · unverdicted · none · ref 3
