The result is presented in Table 10.

Change the MinDistance operation to the sum of distances between hidden states in the same position, which is $\sum_{i=1}^{n} \lVert x_{1,i} - x_{2,i} \rVert$ · 2025 · arXiv 1430.1420
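
The position-wise sum described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summed_position_distance` is hypothetical, each hidden state is taken to be a plain list of floats, and the norm is assumed to be Euclidean, since the source does not make the choice of norm recoverable.

```python
import math
from typing import Sequence


def summed_position_distance(
    x1: Sequence[Sequence[float]],
    x2: Sequence[Sequence[float]],
) -> float:
    """Sum the distances between hidden states that share a position.

    x1[i] and x2[i] are the hidden states of the two sequences at position i.
    The Euclidean norm is an assumption made for this sketch.
    """
    if len(x1) != len(x2):
        raise ValueError("both sequences must have the same number of positions")
    # math.dist returns the Euclidean distance between two equal-length points.
    return sum(math.dist(a, b) for a, b in zip(x1, x2))


# Two sequences with n = 2 positions and 2-dimensional hidden states.
seq_a = [[0.0, 0.0], [1.0, 1.0]]
seq_b = [[3.0, 4.0], [1.0, 1.0]]
total = summed_position_distance(seq_a, seq_b)
```

Unlike a minimum taken over pairs of positions, this sum only compares states with matching indices, so it requires both sequences to have the same length.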

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning · cs.LG · 2026-05-28 · unverdicted · novelty 5.0 · ref 19
Introduces the SVEB benchmark and the Numca/Hista methods, claiming more accurate state value estimates and better RL training performance for LLMs.