Learning with bandit feedback in potential games
1 Pith paper cites this work.
Fields: cs.LG (1)
Years: 2023 (1)
Verdicts: unverdicted (1)
Representative citing papers
Near-Optimal Privacy-Preserving Learning for Max-Min Fair Multi-Agent Bandits
A collision-only coordinated distributed algorithm for max-min fair multi-agent bandits achieves O(N^3 f(log T) log T) regret while preserving local reward privacy.