Uses imitation learning from oracles to train an edge-evaluation policy for lazy graph search, outperforming heuristics on 2D and 7D motion planning problems when test instances are similar to training.
2 Pith papers cite this work. Polarity classification is still indexing.
years
2019: 2
verdicts
UNVERDICTED: 2
representative citing papers
RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.
citing papers explorer
- Leveraging Experience in Lazy Search
  Uses imitation learning from oracles to train an edge-evaluation policy for lazy graph search, outperforming heuristics on 2D and 7D motion planning problems when test instances are similar to training.
- Generalizing from a few environments in safety-critical reinforcement learning
  RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.