arXiv preprint arXiv:2508.03440

3 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.CL: 2 · cs.LG: 1

years

2026: 3

verdicts

unverdicted: 3

representative citing papers

LEPO: Latent Reasoning Policy Optimization for Large Language Models

cs.LG · 2026-04-20 · unverdicted · novelty 5.0

LEPO applies RL to continuous latent representations in LLMs by injecting Gumbel-Softmax stochasticity for diverse trajectory sampling and unified gradient estimation, outperforming existing discrete and latent RL methods.
