Offline reinforcement learning as one big sequence modeling problem. Advances in neural information processing systems, 34:1273–1286, 2021

Michael Janner, Qiyang Li, Sergey Levine · 2021

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Zero-Shot Signal Temporal Logic Planning with Disjunctive Branch Selection in Dynamic Semantic Maps

cs.AI · 2026-05-02 · unverdicted · novelty 6.0

A zero-shot STL planner combines a map-conditioned Transformer with a disjunctive heuristic and Transitive RL to achieve better generalization across dynamic semantic maps.

SeedPolicy: Horizon Scaling via Self-Evolving Diffusion Policy for Robot Manipulation

cs.RO · 2026-03-05 · conditional · novelty 6.0

SeedPolicy introduces self-evolving gated attention to extend the temporal horizon of diffusion policies, yielding 36.8% and 169% relative gains over standard DP on clean and randomized RoboTwin 2.0 tasks.

citing papers explorer

Showing 2 of 2 citing papers.

Zero-Shot Signal Temporal Logic Planning with Disjunctive Branch Selection in Dynamic Semantic Maps
cs.AI · 2026-05-02 · unverdicted · none · ref 5
SeedPolicy: Horizon Scaling via Self-Evolving Diffusion Policy for Robot Manipulation
cs.RO · 2026-03-05 · conditional · none · ref 17
