Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Ask the Right Comparison: Bias-Aware Bayesian Active Top-$k$ Ranking with LLM Judges

cs.LG · 2026-07-02 · unverdicted · novelty 6.0

A bias-aware Bayesian model with judge-specific covariates and a top-k membership uncertainty acquisition rule recovers accurate top-k rankings from noisy LLM judges using fewer comparisons than naive aggregation or standard active learning.

Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models

cs.AI · 2026-05-05 · unverdicted · novelty 6.0

MoR lets clients train local reward models on private preferences and uses a learned Mixture-of-Rewards with GRPO on the server to align a shared base VLM without exchanging parameters, architectures, or raw data.

Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback

cs.LG · 2026-04-30 · unverdicted · novelty 6.0

DRRO for RLHF minimizes worst-case regret relative to the best policy under Wasserstein reward perturbations, yielding an exact inner solution and water-filling policy structure for the promptwise simplex model plus a practical policy-gradient algorithm.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Ask the Right Comparison: Bias-Aware Bayesian Active Top-$k$ Ranking with LLM Judges
cs.LG · 2026-07-02 · unverdicted · none · ref 12
Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback
cs.LG · 2026-04-30 · unverdicted · none · ref 1
