Q-chunking improves offline-to-online RL sample efficiency on long-horizon sparse-reward manipulation tasks by applying action chunking to TD learning.
The option-critic architecture
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
method 1
citation-polarity summary
fields
cs.LG 3roles
method 1polarities
use method 1representative citing papers
SSE improves long-horizon goal-conditioned RL by using failure and partial-success transitions to identify unreliable subgoals, streamline high-level planning, and outperform prior hierarchical methods on benchmarks.