arXiv preprint arXiv:2502.09724

Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning · arXiv 2502.09724

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Multi-User Dueling Bandits: A Fair Approach using Nash Social Welfare

cs.LG · 2026-05-03 · unverdicted · novelty 7.0

The work establishes a regret lower bound of Ω(T^{2/3} min(K,D)^{1/3}) for fair multi-user dueling bandits with heterogeneous Condorcet winners and gives algorithms whose upper bounds match it up to logarithmic factors.

Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heterogeneous Supervised ML

cs.LG · 2026-05-07 · conditional · novelty 6.0

Global Bradley-Terry rankings of LLMs are misleading due to structured heterogeneity in user preferences, and small (λ, ν)-portfolios recover coherent subpopulations that cover over 96% of votes with just five rankings.

citing papers explorer

Showing 2 of 2 citing papers.

Multi-User Dueling Bandits: A Fair Approach using Nash Social Welfare

cs.LG · 2026-05-03 · unverdicted · none · ref 18

Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heterogeneous Supervised ML

cs.LG · 2026-05-07 · conditional · none · ref 44
