Asynchronous methods for deep reinforcement learning

V olodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu · 1928

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

Generalizing from a few environments in safety-critical reinforcement learning

cs.LG · 2019-07-02 · unverdicted · novelty 6.0

RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.

Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives

cs.LG · 2019-06-25 · unverdicted · novelty 6.0

RL policies decompose into information-regularized primitives that compete by requesting state information amounts, with the greediest one acting, yielding better generalization than flat or hierarchical baselines.

citing papers explorer

Showing 2 of 2 citing papers.

Generalizing from a few environments in safety-critical reinforcement learning cs.LG · 2019-07-02 · unverdicted · none · ref 26
RL agents fail dangerously on unseen environments; ensembles reduce catastrophes in gridworld but not CoinRun, with uncertainty enabling intervention prediction.
Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives cs.LG · 2019-06-25 · unverdicted · none · ref 25
RL policies decompose into information-regularized primitives that compete by requesting state information amounts, with the greediest one acting, yielding better generalization than flat or hierarchical baselines.

Asynchronous methods for deep reinforcement learning

fields

years

verdicts

representative citing papers

citing papers explorer