Logarithmic regret for matrix games against an adversary with noisy bandit feedback. arXiv preprint arXiv:2306.13233, 2023

Arnab Maiti, Kevin Jamieson, Lillian J Ratliff · 2023 · arXiv 2306.13233

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Prudent-Banker: No Extra Fees for Baseline Safety in Adversarial Bandits With and Without Delays

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

Prudent-Banker achieves pseudo-regret Õ(√T + √D) and Õ(1) regret against a safe comparator in adversarial bandits, both with and without delays, matching new lower bounds up to logarithmic factors.

citing papers explorer

Showing 1 of 1 citing paper.

Prudent-Banker: No Extra Fees for Baseline Safety in Adversarial Bandits With and Without Delays

cs.LG · 2026-05-22 · unverdicted · none · ref 46
