Competing Bandits in Non-Stationary Matching Markets
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- Matching Markets meet Cumulative Prospect Theory: Towards Optimal and Adversarially Robust Learning
Derives player-optimal regret O(K log T (1/Δ)^{2/α}) for CPT-weighted matching market bandits, improves to K-independent dominant term when K ≫ N via active arm selection, and gives logarithmic regret under known/unknown corruption budgets.