The work establishes a regret lower bound of Ω(T^{2/3} min(K,D)^{1/3}) for fair multi-user dueling bandits with heterogeneous Condorcet winners, and gives algorithms whose upper bounds match it up to logarithmic factors.
Advances in Neural Information Processing Systems, volume=
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Multi-User Dueling Bandits: A Fair Approach using Nash Social Welfare