When to use parametric models in reinforcement learning?

Hado van Hasselt, Matteo Hessel, John Aslanides · 2019 · arXiv 1906.05243

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

cs.LG · 2019-11-19 · accept · novelty 8.0

MuZero matches or exceeds AlphaZero-level performance in Go, chess, and shogi, and sets a new state of the art on 57 Atari games, by learning a model that directly supports planning rather than reconstructing full environment dynamics.

Is Conditional Generative Modeling all you need for Decision-Making?

cs.LG · 2022-11-28 · unverdicted · novelty 6.0

Return-conditional diffusion models for policies outperform offline RL methods on benchmarks by circumventing dynamic programming, and they enable composition of constraints or skills.

citing papers explorer

Showing 2 of 2 citing papers.

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model · cs.LG · 2019-11-19 · accept · none · ref 45
Is Conditional Generative Modeling all you need for Decision-Making? · cs.LG · 2022-11-28 · unverdicted · none · ref 124
