Some aspects of the sequential design of experiments
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
Citing papers
- Prudent-Banker: No Extra Fees for Baseline Safety in Adversarial Bandits With and Without Delays
  Prudent-Banker achieves pseudo-regret Õ(√T + √D) and Õ(1) regret against a safe comparator in adversarial bandits, both with and without delays, matching new lower bounds up to logarithmic factors.
- Active Statistical Inference
  Active inference adapts label collection to ML uncertainty, delivering valid statistical inference for any data distribution with substantially fewer samples than standard non-adaptive methods.