
Veri-R1: Toward precise and faithful claim verification via online reinforcement learning. arXiv preprint arXiv:2510.01932

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

cs.CL 1

years

2026 1

verdicts

UNVERDICTED 1


representative citing papers

Reinforcement Learning from Denoising Feedback

cs.CL · 2026-05-25 · unverdicted · novelty 5.0

RLDF is a new RL paradigm for diffusion language models that optimizes toward clipped clean states with weighted timestep sampling and reports substantial gains on reasoning benchmarks for LLaDA and Dream.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Reinforcement Learning from Denoising Feedback cs.CL · 2026-05-25 · unverdicted · none · ref 9
