Stacked Thompson Bandits

Lenz Belzner; Thomas Gabor

arXiv: 1702.08726 · v1 · pith:6JT2RELInew · submitted 2017-02-28 · 💻 cs.SE · cs.AI · cs.SY


classification 💻 cs.SE · cs.AI · cs.SY

keywords bandits · plans · thompson · guide · satisfy · search · stacked · action


We introduce Stacked Thompson Bandits (STB) for efficiently generating plans that are likely to satisfy a given bounded temporal logic requirement. STB evaluates plans in a simulation and takes a Bayesian approach to using the resulting information to guide its search. In particular, we show that stacking multi-armed bandits and using Thompson sampling to guide action selection for each bandit enables STB to generate plans that satisfy requirements with high probability while searching only a fraction of the search space.
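The abstract gives no pseudocode, so the following is only a minimal sketch of one plausible reading of "stacking" bandits, not the authors' implementation. It assumes one Bernoulli bandit per plan step with a uniform Beta(1, 1) prior per action, and a simulator that returns a single satisfied/violated verdict for the whole plan. The names `stacked_thompson_search` and `toy_requirement`, the per-step independence of the bandits, and the final greedy read-out by posterior mean are all assumptions made for illustration. The toy predicate merely stands in for a bounded temporal logic requirement.

```python
import random


def stacked_thompson_search(simulate, horizon, n_actions, iterations, rng):
    """Search for a plan that satisfies `simulate` (hypothetical sketch).

    One bandit per plan step; each arm (action) keeps a Beta posterior
    over the chance that a plan using it at that step is satisfied.
    """
    wins = [[0] * n_actions for _ in range(horizon)]
    losses = [[0] * n_actions for _ in range(horizon)]

    for _ in range(iterations):
        # Thompson sampling: draw one value per arm from its posterior
        # Beta(wins + 1, losses + 1) and play the arm with the largest draw.
        plan = [
            max(
                range(n_actions),
                key=lambda a, t=t: rng.betavariate(wins[t][a] + 1, losses[t][a] + 1),
            )
            for t in range(horizon)
        ]
        satisfied = simulate(plan)
        # Every bandit in the stack is updated with the same verdict.
        for t, a in enumerate(plan):
            if satisfied:
                wins[t][a] += 1
            else:
                losses[t][a] += 1

    # Read out a final plan greedily by posterior mean (an assumption).
    best = [
        max(
            range(n_actions),
            key=lambda a, t=t: (wins[t][a] + 1) / (wins[t][a] + losses[t][a] + 2),
        )
        for t in range(horizon)
    ]
    return best, wins, losses


def toy_requirement(plan):
    """Stand-in requirement: start with action 1 and end with action 2."""
    return plan[0] == 1 and plan[-1] == 2


if __name__ == "__main__":
    best, _, _ = stacked_thompson_search(
        toy_requirement, horizon=3, n_actions=3, iterations=2000, rng=random.Random(0)
    )
    print(best, toy_requirement(best))
```

In this toy setting only 3 of the 27 possible plans satisfy the requirement. The posterior updates concentrate sampling on the actions that appear in satisfying plans, which is the intuition behind searching only a fraction of the space.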
