Interpretable preferences via multi-objective reward modeling and mixture-of-experts

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2, method: 1

citation-polarity summary

years

2026: 4, 2025: 1

verdicts

unverdicted: 5

representative citing papers

RewardBench 2: Advancing Reward Model Evaluation

cs.CL · 2025-06-02 · unverdicted · novelty 6.0

RewardBench 2 is a new benchmark that supplies challenging fresh human prompts for reward model evaluation, yielding lower average scores but higher correlation with downstream best-of-N sampling and RLHF training performance.
