Latent thinking optimization: Your latent reasoning language model secretly encodes reward signals in its latent thoughts. arXiv preprint arXiv:2509.26314,

3 Pith papers cite this work. Polarity classification is still indexing.

years: 2026 (3)

verdicts: unverdicted (3)

representative citing papers

Sparse Reward Subsystem in Large Language Models

cs.CL · 2026-02-01 · unverdicted · novelty 6.0

LLM hidden states contain a sparse reward subsystem consisting of value neurons that predict state value and dopamine neurons that encode step-level temporal difference errors.
