Model-free learning for two-player zero-sum partially observable Markov games with perfect recall. ArXiv, abs/2106.06279

Tadashi Kozuno, Pierre Ménard, Rémi Munos, Michal Valko · 2021 · arXiv 2106.06279

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it


Showing 1 of 1 citing paper.

Differential Privacy in the Extensive-Form Bandit Problem · cs.CR · 2026-05-06 · unverdicted · none · ref 12
An algorithm achieves Õ(√(A ln(S) T)/ε) regret for extensive-form bandits under ε-local differential privacy, claimed as the first such result.