MM algorithms for generalized Bradley-Terry models. The Annals of Statistics, 32(1):384–406
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
Citing papers
- CAPS: Cascaded Adaptive Pairwise Selection for Efficient Parallel Reasoning
  CAPS is a four-stage inference-only cascade that adapts how much of each solution the verifier sees and how comparisons are distributed, halving per-candidate verifier tokens while outperforming uniform pairwise verification on most benchmarks.
- What Is Preference Optimization Doing, and Why?
  Gradient analysis and ablations show DPO and PPO have different target directions and component roles in preference optimization for LLMs.