Genai arena: An open evaluation platform for generative models. Advances in Neural Information Processing Systems, 37:79889–79908
1 Pith paper cites this work. Polarity classification is still indexing.
citation-role summary
dataset 1
citation-polarity summary
fields
cs.CV 1
years
2026 1
verdicts
UNVERDICTED 1
roles
dataset 1
polarities
use dataset 1
representative citing papers
Think, then Score: Decoupled Reasoning and Scoring for Video Reward Modeling
DeScore decouples CoT reasoning from reward scoring in video reward models using a two-stage training process to improve generalization and avoid optimization bottlenecks of coupled generative RMs.