arXiv preprint arXiv:2410.00371, 2024

Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar, Yijie Guo · 2024 · arXiv 2410.00371

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1

representative citing papers

DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies

cs.RO · 2026-05-12 · unverdicted · novelty 7.0

DreamAvoid uses a Dream Trigger, Action Proposer, and Dream Evaluator trained on success/failure/boundary data to let VLA policies avoid critical-phase failures via test-time future dreaming.

Health-Conditioned Vision-Language-Action Models for Malfunction-Aware Robot Control

cs.RO · 2026-05-15 · unverdicted · novelty 6.0

The paper introduces health-conditioned VLA models that incorporate a health vector via a new projector module and train on 128 malfunction episodes in the LIBERO simulator to complete tasks despite degraded joints.

From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation

cs.RO · 2026-05-12 · unverdicted · novelty 6.0

AgentChord models manipulation tasks as directed graphs enriched with anticipatory recovery branches, using specialized agents to enable immediate, low-latency failure responses and improve success on long-horizon bimanual tasks.

A Physical Agentic Loop for Language-Guided Grasping with Execution-State Monitoring

cs.RO · 2026-04-08 · unverdicted · novelty 6.0

A physical agentic loop with execution-state monitoring improves robustness of language-guided grasping over open-loop execution by converting noisy telemetry into discrete outcome events that trigger retries or user escalation.

Clutter-Robust Vision-Language-Action Models through Object-Centric and Geometry Grounding

cs.RO · 2025-12-27 · conditional · novelty 6.0

OBEYED-VLA improves VLA robustness in cluttered real-world manipulation by disentangling perception into VLM-based object-centric grounding and geometry-aware stages, then fine-tuning the policy only on single-object demonstrations.

RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields

cs.RO · 2024-12-03 · unverdicted · novelty 6.0

A deep RL vulnerability-prediction policy trained in semantic embedding space finds up to 23% more unique robot manipulation failures than vision-language baselines and enables more efficient fine-tuning.

Failing Forward: Adaptive Failure-Informed Learning for Vision-Language-Action Models

cs.RO · 2026-05-08 · unverdicted · novelty 5.0 · 2 refs

AFIL trains dual action generators on success and failure rollouts from a pretrained VLA to steer diffusion policies away from failure modes during inference.

Environmental Understanding Vision-Language Model for Embodied Agent

cs.CV · 2026-04-21 · unverdicted · novelty 5.0

EUEA fine-tunes VLMs on object perception, task planning, action understanding and goal recognition, with recovery and GRPO, to raise ALFRED success rates by 11.89% over behavior cloning.

Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models

cs.RO · 2025-10-22 · unverdicted · novelty 5.0

A hierarchical framework pairs in-context VLMs for high-level plan synthesis with RL-trained low-level skills and failure recovery, reaching 92% success on long-horizon DLO routing across varied scenes and language inputs.

VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation

cs.RO · 2025-09-26 · unverdicted · novelty 5.0

The VLBiMan framework enables generalizable bimanual manipulation from a single human demonstration via vision-language anchored task decomposition and adaptation, without retraining.

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

cs.CV · 2025-07-22 · unverdicted · novelty 5.0

ThinkAct introduces reinforced visual latent planning in a dual VLA system to enable better long-horizon reasoning and adaptation for embodied tasks.

Sentinel-VLA: A Metacognitive VLA Model with Active Status Monitoring for Dynamic Reasoning and Error Recovery

cs.RO · 2026-05-02

citing papers explorer

DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies
cs.RO · 2026-05-12 · unverdicted · none · ref 31

Health-Conditioned Vision-Language-Action Models for Malfunction-Aware Robot Control
cs.RO · 2026-05-15 · unverdicted · none · ref 7

From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation
cs.RO · 2026-05-12 · unverdicted · none · ref 13

A Physical Agentic Loop for Language-Guided Grasping with Execution-State Monitoring
cs.RO · 2026-04-08 · unverdicted · none · ref 12

Clutter-Robust Vision-Language-Action Models through Object-Centric and Geometry Grounding
cs.RO · 2025-12-27 · conditional · none · ref 22

RoboMD: Uncovering Robot Vulnerabilities through Semantic Potential Fields
cs.RO · 2024-12-03 · unverdicted · none · ref 5

Failing Forward: Adaptive Failure-Informed Learning for Vision-Language-Action Models
cs.RO · 2026-05-08 · unverdicted · none · ref 23 · 2 links

Environmental Understanding Vision-Language Model for Embodied Agent
cs.CV · 2026-04-21 · unverdicted · none · ref 10

Hierarchical DLO Routing with Reinforcement Learning and In-Context Vision-language Models
cs.RO · 2025-10-22 · unverdicted · none · ref 26

VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation
cs.RO · 2025-09-26 · unverdicted · none · ref 6

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
cs.CV · 2025-07-22 · unverdicted · none · ref 11

Sentinel-VLA: A Metacognitive VLA Model with Active Status Monitoring for Dynamic Reasoning and Error Recovery
cs.RO · 2026-05-02 · unreviewed · ref 6
