arXiv preprint arXiv:2010.13611 , year=

Opal: Offline primitive discovery for accelerating offline reinforcement learning , author= · 2010 · arXiv 2010.13611

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Decision Transformer: Reinforcement Learning via Sequence Modeling

cs.LG · 2021-06-02 · accept · novelty 8.0

Decision Transformer casts RL as autoregressive sequence modeling conditioned on desired returns, past states and actions, matching or exceeding offline RL baselines on Atari, Gym and Key-to-Door tasks.

Implicit Safety Alignment from Crowd Preferences

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

A hierarchical framework extracts implicit safety criteria from crowd preferences and composes them via high-level policy to reduce safety violations in downstream RL tasks without explicit safety rewards.

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation

cs.RO · 2021-08-06 · accept · novelty 6.0

A comprehensive benchmark study of offline imitation learning methods on multi-stage robot manipulation tasks identifies key sensitivities to algorithm design, data quality, and stopping criteria while releasing all datasets and code.

citing papers explorer

Showing 3 of 3 citing papers.

Decision Transformer: Reinforcement Learning via Sequence Modeling cs.LG · 2021-06-02 · accept · none · ref 34
Decision Transformer casts RL as autoregressive sequence modeling conditioned on desired returns, past states and actions, matching or exceeding offline RL baselines on Atari, Gym and Key-to-Door tasks.
Implicit Safety Alignment from Crowd Preferences cs.AI · 2026-05-20 · unverdicted · none · ref 39
A hierarchical framework extracts implicit safety criteria from crowd preferences and composes them via high-level policy to reduce safety violations in downstream RL tasks without explicit safety rewards.
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation cs.RO · 2021-08-06 · accept · none · ref 48
A comprehensive benchmark study of offline imitation learning methods on multi-stage robot manipulation tasks identifies key sensitivities to algorithm design, data quality, and stopping criteria while releasing all datasets and code.

arXiv preprint arXiv:2010.13611 , year=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer