arXiv preprint arXiv:2402.01878, 2024

URL https://arxiv · 2024 · arXiv 2402.01878

4 Pith papers cite this work.

representative citing papers

Response Time Enhances Alignment with Heterogeneous Preferences

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Response times modeled as drift-diffusion processes enable consistent estimation of population-average preferences from heterogeneous anonymous binary choices.

Threshold-Guided Optimization for Visual Generative Models

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

A threshold-guided alignment method lets visual generative models be optimized directly from scalar human ratings instead of requiring paired preference data.

The Differences Between Direct Alignment Algorithms are a Blur

cs.LG · 2025-02-03 · unverdicted · novelty 6.0

A controlled unification of direct alignment algorithms shows the ranking objective (pairwise vs pointwise) drives alignment quality more than the scalar score optimized.

UNA: A Unified Supervised Framework for Efficient LLM Alignment Across Feedback Types

cs.LG · 2024-08-27 · unverdicted · novelty 6.0

UNA unifies binary, pairwise, and score-based feedback for LLM alignment via a generalized implicit reward function shown optimal by the log sum inequality.

citing papers explorer

Response Time Enhances Alignment with Heterogeneous Preferences · cs.LG · 2026-05-07 · unverdicted · none · ref 160
Threshold-Guided Optimization for Visual Generative Models · cs.LG · 2026-05-06 · unverdicted · none · ref 30
The Differences Between Direct Alignment Algorithms are a Blur · cs.LG · 2025-02-03 · unverdicted · none · ref 22
UNA: A Unified Supervised Framework for Efficient LLM Alignment Across Feedback Types · cs.LG · 2024-08-27 · unverdicted · none · ref 9
