Shared Autonomy via Hindsight Optimization

2015 · cs.RO · arXiv 1503.07619

1 Pith paper cites this work. Polarity classification is still being indexed.

abstract

In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal, and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the history of inputs. Ideally, the robot assists the user by solving for an action which minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while utilizing less input. However, when asked to rate each system, users were mixed in their assessment, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
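The pipeline described in the abstract (infer a belief over goals from user input, then pick the action with the lowest expected cost-to-go) can be illustrated with a minimal sketch. This is not the authors' implementation. The point robot, the eight discrete actions, the distance-based cost, the `beta` rationality parameter, and all function names are assumptions made for illustration. The user model is a soft-min over action costs in the spirit of maximum entropy inverse optimal control, and the action rule averages per-goal costs under the belief, which is how hindsight optimization is commonly approximated.

```python
import numpy as np

# Hypothetical setup: a point robot in the plane with eight unit-direction actions.
ANGLES = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
ACTIONS = np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)
STEP = 0.1  # assumed distance moved per action


def q_values(x, goal):
    """Assumed per-goal cost of each action: step cost plus remaining distance."""
    next_x = x + STEP * ACTIONS
    return STEP + np.linalg.norm(next_x - goal, axis=1)


def update_belief(belief, x, user_action_idx, goals, beta=10.0):
    """Bayes update of the goal belief under a soft-min user model.

    Assumes the user picks action a for goal g with probability
    proportional to exp(-beta * Q_g(x, a)).
    """
    posterior = np.empty_like(belief)
    for i, goal in enumerate(goals):
        logits = -beta * q_values(x, goal)
        logits -= logits.max()  # subtract the max for numerical stability
        likelihood = np.exp(logits) / np.exp(logits).sum()
        posterior[i] = belief[i] * likelihood[user_action_idx]
    return posterior / posterior.sum()


def hindsight_action(belief, x, goals):
    """Return the index of the action minimizing the belief-weighted cost."""
    expected_q = sum(b * q_values(x, g) for b, g in zip(belief, goals))
    return int(np.argmin(expected_q))


if __name__ == "__main__":
    goals = np.array([[1.0, 0.0], [-1.0, 0.0]])
    x = np.zeros(2)
    belief = np.full(len(goals), 0.5)
    # The user pushes toward +x (action index 0); the belief shifts to goal 0.
    belief = update_belief(belief, x, 0, goals)
    print(belief, hindsight_action(belief, x, goals))
```

Because the action is chosen against the whole belief rather than a single predicted goal, the robot can start assisting before it is confident which goal the user wants, which is the contrast the abstract draws with predict-then-blend.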

representative citing papers

Learning Reward Functions by Integrating Human Demonstrations and Preferences

cs.RO · 2019-06-21 · unverdicted · novelty 6.0

DemPref uses demonstrations to form a coarse reward prior and ground active preference queries, achieving higher efficiency than pure preference learning and higher user preference than IRL in experiments.
