A deep Q-network learns Atari control policies from raw pixels and beats prior algorithms on six games while surpassing humans on three.
The arcade learning environment: An evaluation platform for general agents
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: dataset 1
citation-polarity summary
polarities: use dataset 1
citing papers explorer
- Playing Atari with Deep Reinforcement Learning
  A deep Q-network learns Atari control policies from raw pixels and beats prior algorithms on six games while surpassing humans on three.
- Decision Transformer: Reinforcement Learning via Sequence Modeling
  Decision Transformer casts RL as autoregressive sequence modeling conditioned on desired returns, past states and actions, matching or exceeding offline RL baselines on Atari, Gym and Key-to-Door tasks.
- Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
  MuZero matches or exceeds AlphaZero-level performance in Go, Chess, Shogi and sets a new state of the art on 57 Atari games by learning a model that directly supports planning rather than reconstructing full environment dynamics.
- Why Build an Assistant in Minecraft?
  A rationale is presented for developing an assistant in Minecraft to advance natural language understanding and dialogue learning.