Humanoid-VLA: Towards universal humanoid control with visual integration

Pengxiang Ding, Jianfei Ma, Xinyang Tong, Binghong Zou, Xinxin Luo, Yiguo Fan, Ting Wang, Hongchao Lu, Panzhong Mo, Jinxin Liu, et al · 2025 · arXiv 2502.14795

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Dynamic Full-body Motion Agent with Object Interaction via Blending Pre-trained Modular Controllers

cs.CV · 2026-05-12 · unverdicted · novelty 7.0

A two-stage framework augments HOI data with dynamic priors and blends pre-trained dynamic motion and static interaction agents via a composer network to enable long-term dynamic human-object interactions with higher success rates and reduced training time.

Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot

cs.RO · 2026-04-23 · unverdicted · novelty 6.0

The Weightlessness Mechanism lets humanoid robots imitate non-self-stabilizing motions by dynamically relaxing specific joints to exploit passive environmental contacts, generalizing from single demonstrations to varied setups.

HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation

cs.RO · 2026-04-09 · unverdicted · novelty 6.0 · 2 refs

HEX introduces a state-centric framework with humanoid-aligned representations and mixture-of-experts proprioceptive prediction for coordinated whole-body control on bipedal humanoids.

VLANeXt: Recipes for Building Strong VLA Models

cs.CV · 2026-02-20 · conditional · novelty 6.0

VLANeXt distills 12 design insights from a unified VLA study into a model that outperforms prior methods on LIBERO benchmarks while releasing code for further exploration.

Self-Supervised Bootstrapping of Action-Predictive Embodied Reasoning

cs.RO · 2026-02-09 · unverdicted · novelty 6.0

R&B-EnCoRe uses self-supervised importance-weighted variational inference to distill action-predictive reasoning datasets that improve VLA performance on manipulation, navigation, and driving tasks without external verifiers.

Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary

cs.RO · 2025-11-28 · unverdicted · novelty 6.0

Humanoid-LLA converts unconstrained natural language commands into stable whole-body motions for humanoid robots using a unified motion vocabulary and two-stage supervised-plus-reinforcement fine-tuning.

AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

cs.CV · 2025-07-17 · unverdicted · novelty 6.0

AnyPos automates task-agnostic action collection and inverse-dynamics modeling with arm/end-effector decoupling plus a direction-aware decoder, delivering 51% higher test accuracy and 30-40% better success rates on bimanual tasks.

Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum

cs.RO · 2026-05-20 · unverdicted · novelty 5.0

A multi-agent LLM framework for humanoid loco-manipulation that separates active spatial perception and task planning from generalizable action generation without task-specific real-robot data.

Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey

cs.RO · 2025-08-18 · unverdicted · novelty 5.0

This survey organizes large VLM-based VLA models for robotic manipulation into monolithic and hierarchical paradigms, reviews their integrations and datasets, and outlines future directions.

citing papers explorer

Showing 9 of 9 citing papers.

Dynamic Full-body Motion Agent with Object Interaction via Blending Pre-trained Modular Controllers
cs.CV · 2026-05-12 · unverdicted · none · ref 10

Learn Weightlessness: Imitate Non-Self-Stabilizing Motions on Humanoid Robot
cs.RO · 2026-04-23 · unverdicted · none · ref 7

HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation
cs.RO · 2026-04-09 · unverdicted · none · ref 11 · 2 links

VLANeXt: Recipes for Building Strong VLA Models
cs.CV · 2026-02-20 · conditional · none · ref 9

Self-Supervised Bootstrapping of Action-Predictive Embodied Reasoning
cs.RO · 2026-02-09 · unverdicted · none · ref 43

Commanding Humanoid by Free-form Language: A Large Language Action Model with Unified Motion Vocabulary
cs.RO · 2025-11-28 · unverdicted · none · ref 8

AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation
cs.CV · 2025-07-17 · unverdicted · none · ref 11

Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum
cs.RO · 2026-05-20 · unverdicted · none · ref 7

Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey
cs.RO · 2025-08-18 · unverdicted · none · ref 198
