An algorithm achieves Õ(√(A ln(S) T)/ε) regret for extensive-form bandits under ε-local differential privacy, claimed as the first such result.
Model-free learning for two-player zero-sum partially observable Markov games with perfect recall. ArXiv, abs/2106.06279
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Differential Privacy in the Extensive-Form Bandit Problem