A penalized bandit algorithm
classification
🧮 math.PR
keywords
algorithm, convergence, distribution, limit, armed-bandit, bandit, central, characterized
abstract
We study a two-armed bandit algorithm with penalty. We show that the algorithm converges and establish its rate of convergence. For some choices of the parameters, we obtain a central limit theorem in which the limit distribution is characterized as the unique stationary distribution of a discontinuous Markov process.
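The abstract does not state the update rule, so the following is only a minimal sketch of what a two-armed bandit with penalty might look like, assuming a standard linear reward-penalty scheme. The function name `penalized_bandit`, the success probabilities `p_a` and `p_b`, and the step and penalty schedules `gamma` and `rho` are all hypothetical choices made for illustration, not the paper's definitions. Here `x` is the probability of playing arm A: a success on the played arm moves `x` toward that arm, and a failure moves it away with the smaller weight `gamma * rho`.

```python
import random


def penalized_bandit(p_a, p_b, n_steps, x0=0.5, seed=0):
    """Simulate an illustrative penalized two-armed bandit.

    Returns the final probability of playing arm A.
    All schedules below are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        gamma = 1.0 / (n + 1)         # assumed reward step size
        rho = 1.0 / (n + 1) ** 0.5    # assumed penalty weight
        if rng.random() < x:
            # Arm A is played.
            if rng.random() < p_a:
                x += gamma * (1.0 - x)    # reward: move toward A
            else:
                x -= gamma * rho * x      # penalty: move away from A
        else:
            # Arm B is played.
            if rng.random() < p_b:
                x -= gamma * x                  # reward for B: move toward B
            else:
                x += gamma * rho * (1.0 - x)    # penalty for B: move toward A
    return x


if __name__ == "__main__":
    print(penalized_bandit(p_a=0.8, p_b=0.3, n_steps=10_000))
```

Because both `gamma` and `gamma * rho` lie strictly between 0 and 1 under these assumed schedules, every update is a convex combination of the current `x` with 0 or 1, so `x` remains a valid probability throughout.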