{"paper":{"title":"Asymptotic Behavior of Minimal-Exploration Allocation Policies: Almost Sure, Arbitrarily Slow Growing Regret","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["cs.LG"],"primary_cat":"stat.ML","authors_text":"Michael N. Katehakis, Wesley Cowan","submitted_at":"2015-05-12T03:35:47Z","abstract_excerpt":"The purpose of this paper is to provide further understanding into the structure of the sequential allocation (\"stochastic multi-armed bandit\", or MAB) problem by establishing probability one finite horizon bounds and convergence rates for the sample (or \"pseudo\") regret associated with two simple classes of allocation policies $\\pi$.\n  For any slowly increasing function $g$, subject to mild regularity constraints, we construct two policies (the $g$-Forcing, and the $g$-Inflated Sample Mean) that achieve a measure of regret of order $ O(g(n))$ almost surely as $n \\to \\infty$, bound from above "},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1505.02865","kind":"arxiv","version":2},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}