arXiv preprint arXiv:2507.10672 , year=

· 2025 · arXiv 2507.10672

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

read on arXiv browse 9 citing papers

citation-role summary

background 1 method 1

citation-polarity summary

background 1 use method 1

representative citing papers

Bounding Boxes as Goals: Language-Conditioned Grasping via Neuro-Symbolic Planning

cs.RO · 2026-06-11 · unverdicted · novelty 6.0

GRASP maps natural language to bounding-box goals via VLM for neuro-symbolic planning and reports 73.3% success in 90 real-robot trials without task-specific training.

GaussianDream: A Feed-Forward 3D Gaussian World Model for Robotic Manipulation

cs.RO · 2026-05-20 · unverdicted · novelty 6.0

GaussianDream is a feed-forward 3D Gaussian world model plug-in that conditions VLA policies on learned 3D spatial and future evolution representations for improved robotic manipulation performance.

RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

cs.RO · 2026-05-13 · unverdicted · novelty 6.0

A co-evolutionary VLM-VGM loop on 500 unlabeled images raises planner success by 30 points and simulator success by 48 percent while beating fully supervised baselines.

AffordVLA: Injecting Affordance Representations into Vision-Language-Action Models via Implicit Feature Alignment

cs.RO · 2026-05-17 · unverdicted · novelty 5.0

AffordVLA improves VLA models for robotic manipulation by implicitly injecting affordance perception through feature alignment with a zero-shot teacher, claiming SOTA results in simulation and real-world tests.

STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation

cs.RO · 2026-04-29 · unverdicted · novelty 5.0

STARRY uses unified diffusion to align spatial-temporal world predictions with action generation plus GASAM for geometry-aware attention, reaching 93.82%/93.30% success on 50 bimanual tasks in simulation and raising real-world success from 42.5% to 70.8%.

Goal2Skill: Long-Horizon Manipulation with Adaptive Planning and Reflection

cs.RO · 2026-04-15 · unverdicted · novelty 5.0

A dual VLM-VLA framework for long-horizon robot manipulation achieves 32.4% success on RMBench tasks versus 9.8% for the strongest baseline via structured memory and closed-loop adaptive replanning.

Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap

cs.RO · 2026-04-15 · unverdicted · novelty 4.0

A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.

From Video to Control: A Survey of Learning Manipulation Interfaces from Temporal Visual Data

cs.RO · 2026-04-04

LIBERO-PRO: Towards Robust and Fair Evaluation of Vision-Language-Action Models Beyond Memorization

cs.CV · 2025-10-04

citing papers explorer

Showing 7 of 7 citing papers after filters.

Bounding Boxes as Goals: Language-Conditioned Grasping via Neuro-Symbolic Planning cs.RO · 2026-06-11 · unverdicted · none · ref 3
GRASP maps natural language to bounding-box goals via VLM for neuro-symbolic planning and reports 73.3% success in 90 real-robot trials without task-specific training.
GaussianDream: A Feed-Forward 3D Gaussian World Model for Robotic Manipulation cs.RO · 2026-05-20 · unverdicted · none · ref 46
GaussianDream is a feed-forward 3D Gaussian world model plug-in that conditions VLA policies on learned 3D spatial and future evolution representations for improved robotic manipulation performance.
RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data cs.RO · 2026-05-13 · unverdicted · none · ref 52
A co-evolutionary VLM-VGM loop on 500 unlabeled images raises planner success by 30 points and simulator success by 48 percent while beating fully supervised baselines.
AffordVLA: Injecting Affordance Representations into Vision-Language-Action Models via Implicit Feature Alignment cs.RO · 2026-05-17 · unverdicted · none · ref 23
AffordVLA improves VLA models for robotic manipulation by implicitly injecting affordance perception through feature alignment with a zero-shot teacher, claiming SOTA results in simulation and real-world tests.
STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation cs.RO · 2026-04-29 · unverdicted · none · ref 13
STARRY uses unified diffusion to align spatial-temporal world predictions with action generation plus GASAM for geometry-aware attention, reaching 93.82%/93.30% success on 50 bimanual tasks in simulation and raising real-world success from 42.5% to 70.8%.
Goal2Skill: Long-Horizon Manipulation with Adaptive Planning and Reflection cs.RO · 2026-04-15 · unverdicted · none · ref 4
A dual VLM-VLA framework for long-horizon robot manipulation achieves 32.4% success on RMBench tasks versus 9.8% for the strongest baseline via structured memory and closed-loop adaptive replanning.
Vision-and-Language Navigation for UAVs: Progress, Challenges, and a Research Roadmap cs.RO · 2026-04-15 · unverdicted · none · ref 164
A survey of UAV vision-and-language navigation that establishes a methodological taxonomy, reviews resources and challenges, and proposes a forward-looking research roadmap.

arXiv preprint arXiv:2507.10672 , year=

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer