arXiv preprint arXiv:2412.09858 (2024) 9

· 2024 · arXiv 2412.09858

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

read on arXiv browse 11 citing papers

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Improving Robotic Generalist Policies via Flow Reversal Steering

cs.RO · 2026-06-11 · unverdicted · novelty 7.0

Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.

${\pi}_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities

cs.LG · 2026-04-16 · unverdicted · novelty 7.0

π₀.₇ is a steerable generalist robotic model that uses rich multimodal prompts including language, subgoal images, and performance metadata to achieve out-of-the-box generalization across tasks and robot bodies.

Flow-based Policy Adaptation without Policy Updates

cs.RO · 2026-06-04 · unverdicted · novelty 6.0

GLOVES learns flow models from limited expert demonstrations to selectively correct actions from non-expert policies or operators toward expert distributions using reverse-flow OOD detection as an intervention gate.

Learning While Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies

cs.RO · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

LWD is a fleet-scale offline-to-online RL framework that continually improves pretrained VLA policies using autonomous rollouts and human interventions, reaching 95% average success on real-world manipulation tasks.

$\pi^{*}_{0.6}$: a VLA That Learns From Experience

cs.LG · 2025-11-18 · unverdicted · novelty 6.0

RECAP enables a generalist VLA to self-improve via advantage-conditioned RL on mixed real-world data, more than doubling throughput and halving failure rates on hard manipulation tasks.

VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning

cs.RO · 2025-05-24 · conditional · novelty 6.0

VLA-RL applies online RL to pretrained VLAs, yielding a 4.5% gain over strong baselines on 40 LIBERO manipulation tasks and matching commercial models like π₀-FAST.

Scalable Multi-Task Data Generation via Reinforcement Learning for Language-Conditioned Bimanual Dexterous Manipulation

cs.RO · 2026-06-21 · unverdicted · novelty 5.0 · 2 refs

An RL data generation pipeline with generalizable rewards and language annotations produces diverse synthetic datasets that improve multi-task policy generalization on three bimanual manipulation tasks.

AllDayNav: Lifelong Navigation via Real-World Reinforcement Learning

cs.RO · 2026-06-09 · unverdicted · novelty 5.0

AllDayNav encodes scene dynamics into a large model's parameters via RL and a multimodal memory, achieving near-100% success rates in lifelong navigation and outperforming map-based and VLM baselines.

DexPIE: Stable Dexterous Policy Improvement from Real-World Experience

cs.RO · 2026-06-08 · unverdicted · novelty 5.0

DexPIE improves dexterous manipulation success rates by 37% over demo policies via real-world experience collection with adapted intervention, multi-stage DAgger, asynchronous relative-action inference, and optimality conditioning.

Multi-Objective Learning for Diffusion Models: A Statistical Theory under Semi-Supervised Learning

cs.LG · 2026-05-24 · unverdicted · novelty 5.0

A semi-supervised MOL framework for diffusion models with generalization bounds depending only on specialist model complexity, extended to diffusion policies for sequential decisions.

JoyAI-Sim: A Simulation-Enabled Interconversion Toolchain for the Embodied Data Pyramid

cs.RO · 2026-06-15 · unverdicted · novelty 3.0

JoyAI-Sim provides bidirectional Robot-Simulation-Human pathways for aligned model evaluation and data generation in robotics using the JoySim simulator as an evaluation layer and physical consistency filter.

citing papers explorer

Showing 10 of 10 citing papers after filters.

Improving Robotic Generalist Policies via Flow Reversal Steering cs.RO · 2026-06-11 · unverdicted · none · ref 48
Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.
${\pi}_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities cs.LG · 2026-04-16 · unverdicted · none · ref 53
π₀.₇ is a steerable generalist robotic model that uses rich multimodal prompts including language, subgoal images, and performance metadata to achieve out-of-the-box generalization across tasks and robot bodies.
Flow-based Policy Adaptation without Policy Updates cs.RO · 2026-06-04 · unverdicted · none · ref 47
GLOVES learns flow models from limited expert demonstrations to selectively correct actions from non-expert policies or operators toward expert distributions using reverse-flow OOD detection as an intervention gate.
Learning While Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies cs.RO · 2026-05-01 · unverdicted · none · ref 30 · 2 links
LWD is a fleet-scale offline-to-online RL framework that continually improves pretrained VLA policies using autonomous rollouts and human interventions, reaching 95% average success on real-world manipulation tasks.
$\pi^{*}_{0.6}$: a VLA That Learns From Experience cs.LG · 2025-11-18 · unverdicted · none · ref 43
RECAP enables a generalist VLA to self-improve via advantage-conditioned RL on mixed real-world data, more than doubling throughput and halving failure rates on hard manipulation tasks.
Scalable Multi-Task Data Generation via Reinforcement Learning for Language-Conditioned Bimanual Dexterous Manipulation cs.RO · 2026-06-21 · unverdicted · none · ref 24 · 2 links
An RL data generation pipeline with generalizable rewards and language annotations produces diverse synthetic datasets that improve multi-task policy generalization on three bimanual manipulation tasks.
AllDayNav: Lifelong Navigation via Real-World Reinforcement Learning cs.RO · 2026-06-09 · unverdicted · none · ref 62
AllDayNav encodes scene dynamics into a large model's parameters via RL and a multimodal memory, achieving near-100% success rates in lifelong navigation and outperforming map-based and VLM baselines.
DexPIE: Stable Dexterous Policy Improvement from Real-World Experience cs.RO · 2026-06-08 · unverdicted · none · ref 14
DexPIE improves dexterous manipulation success rates by 37% over demo policies via real-world experience collection with adapted intervention, multi-stage DAgger, asynchronous relative-action inference, and optimality conditioning.
Multi-Objective Learning for Diffusion Models: A Statistical Theory under Semi-Supervised Learning cs.LG · 2026-05-24 · unverdicted · none · ref 17
A semi-supervised MOL framework for diffusion models with generalization bounds depending only on specialist model complexity, extended to diffusion policies for sequential decisions.
JoyAI-Sim: A Simulation-Enabled Interconversion Toolchain for the Embodied Data Pyramid cs.RO · 2026-06-15 · unverdicted · none · ref 48
JoyAI-Sim provides bidirectional Robot-Simulation-Human pathways for aligned model evaluation and data generation in robotics using the JoySim simulator as an evaluation layer and physical consistency filter.

arXiv preprint arXiv:2412.09858 (2024) 9

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer