Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
Pre-training for robots: Offline rl en- ables learning new tasks from a handful of trials.arXiv preprint arXiv:2210.05178
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2representative citing papers
RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.
A GPT-style model pre-trained on large video datasets achieves 94.9% success on CALVIN multi-task manipulation and 85.4% zero-shot generalization, outperforming prior baselines.
VGAS uses best-of-N selection with a geometrically grounded critic and explicit regularization to improve success rates of few-shot VLA policies under limited data and distribution shifts.
MimicGen creates over 50K robot demonstrations from roughly 200 human ones, allowing imitation learning to achieve strong performance on complex long-horizon tasks like assembly and coffee preparation.
TinyVLA achieves faster inference and higher data efficiency than OpenVLA on robotic manipulation tasks by initializing from high-speed multimodal models and adding a diffusion policy decoder, without any pre-training phase.
citing papers explorer
-
Target-Aligned Bellman Backup for Cross-domain Offline Reinforcement Learning
Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
-
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.
-
VGAS: Value-Guided Action-Chunk Selection for Few-Shot Vision-Language-Action Adaptation
VGAS uses best-of-N selection with a geometrically grounded critic and explicit regularization to improve success rates of few-shot VLA policies under limited data and distribution shifts.
-
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations
MimicGen creates over 50K robot demonstrations from roughly 200 human ones, allowing imitation learning to achieve strong performance on complex long-horizon tasks like assembly and coffee preparation.
-
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
TinyVLA achieves faster inference and higher data efficiency than OpenVLA on robotic manipulation tasks by initializing from high-speed multimodal models and adding a diffusion policy decoder, without any pre-training phase.