{"total":13,"items":[{"citing_arxiv_id":"2607.01088","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ROSA: A Robotics Foundation Model Serving System for Robot Factories","primary_cat":"cs.RO","submitted_at":"2026-07-01T15:45:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ROSA introduces shared GPU-pool serving, robotics-aware abstractions for multi-model pipelines, and factory-productivity scheduling that improves output by up to 12.06x over dedicated per-robot systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.09215","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MotionWAM: Towards Foundation World Action Models for Real-Time Humanoid Loco-Manipulation","primary_cat":"cs.RO","submitted_at":"2026-06-08T08:50:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MotionWAM conditions a policy on intermediate features from a video world model to predict unified whole-body motion tokens, enabling real-time humanoid loco-manipulation that outperforms VLA baselines by over 30% on nine Unitree G1 tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.07934","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"X-OP: Cross-Morphology Whole-Body Teleoperation via MPC Retargeting","primary_cat":"cs.RO","submitted_at":"2026-06-06T01:50:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MPC-based retargeting framework enables cross-morphology whole-body teleoperation from a single XR device via dynamic feasibility optimization, state synchronization, and SLAM feedback, with reported gains 
in simulation and real-world tests.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.05687","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Accelerating and Scaling MPC-Guided Reinforcement Learning for Humanoid Locomotion and Manipulation","primary_cat":"cs.RO","submitted_at":"2026-06-04T04:12:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MPC-RL combines a centroidal-dynamics MPC reward with a batched GPU solver (π^n MPC) to accelerate RL training for humanoid locomotion and manipulation tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00576","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Dynamic Resilient Spatio-Semantic Memory with Hybrid Localization for Mobile Manipulation","primary_cat":"cs.RO","submitted_at":"2026-05-30T06:58:03+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"DREAM is a mobile manipulation system that constructs online spatio-semantic voxel memory with redundancy-aware pruning and hybrid language-vision localization, reporting higher long-horizon success rates than DynaMem in dynamic lab scenes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.21133","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Humanoid Whole-Body Manipulation via Active Spatial Brain and Generalizable Action Cerebellum","primary_cat":"cs.RO","submitted_at":"2026-05-20T13:05:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A multi-agent LLM framework for humanoid loco-manipulation that separates active 
spatial perception and task planning from generalizable action generation without task-specific real-robot data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03452","ref_index":5,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation","primary_cat":"cs.RO","submitted_at":"2026-05-05T07:35:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BifrostUMI enables robot-free human demonstration capture via VR and wrist cameras to train visuomotor policies that predict keypoint trajectories for transfer to humanoid whole-body control through retargeting.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Wetzstein, and C. Finn, \"Humanplus: Humanoid shadowing and imitation from humans,\" 2024. [Online]. Available: https://arxiv.org/abs/2406.10454 [4] Y. Li, Y. Lin, J. Cui, T. Liu, W. Liang, Y. Zhu, and S. Huang, \"Clone: Closed-loop whole-body humanoid teleoperation for long-horizon tasks,\" 2025. [Online]. Available: https://arxiv.org/abs/2506.08931 [5] J. Li, X. Cheng, T. Huang, S. Yang, R.-Z. Qiu, and X. Wang, \"Amo: Adaptive motion optimization for hyper-dexterous humanoid whole-body control,\" 2025. [Online]. Available: https://arxiv.org/abs/2505.03738 [6] C. Lu, X. Cheng, J. Li, S. Yang, M. Ji, C. Yuan, G. Yang, S. Yi, and X. 
Wang, \"Mobile-television: Predictive motion priors for humanoid whole-body control,\" 2025."},{"citing_arxiv_id":"2605.00078","ref_index":124,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Being-H0.7: A Latent World-Action Model from Egocentric Videos","primary_cat":"cs.RO","submitted_at":"2026-04-30T14:16:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Being-H0.7 adds future-aware latent reasoning to direct VLA policies via dual-branch alignment on latent queries, matching world-model benefits at VLA efficiency.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"In Unitree G1, the policy exposes a 26-DoF upper-body action interface, i.e., 14 arm joints plus 12 Linkerbot O6 hand joints. Franka FR3 provides a 7-DoF arm paired with a single Linkerbot O6 hand. For Unitree G1, the policy still exposes the same 26-DoF action interface used by the rest of our deployment stack. The additional backend is a pretrained AMO controller [124], used as the balance-aware low-level whole-body module for humanoid execution. 
In our integration, AMO owns the 50Hz Unitree body-control loop, predicts lower-body and waist commands conditioned on the latest upper-arm targets, and composes the final body command for execution, while the Linkerbot O6 hands remain controlled through the same hand interface as the other embodiments."},{"citing_arxiv_id":"2604.13015","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning Versatile Humanoid Manipulation with Touch Dreaming","primary_cat":"cs.RO","submitted_at":"2026-04-14T17:54:17+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"HTD, a multimodal transformer policy trained with behavioral cloning and touch dreaming to predict future tactile latents, achieves a 90.9% relative success rate improvement over baselines on five real-world contact-rich humanoid loco-manipulation tasks.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"HTD consistently improves over baselines, with latent tactile prediction outperforming raw tactile prediction. II. RELATEDWORK TABLE I: Comparisons to previous humanoid manipulation learn- ing systems Method End-Effector Dexterity Whole- Body Touch Sensing Touch Modeling OmniH2O [1] Dex-Hand Full✓ ✗ ✗ HumanPlus [5] Dex-Hand Full✓ ✗ ✗ Mobile-TeleVision [20] Dex-Hand Full✓ ✗ ✗ AMO [21] Dex-Hand Full✓ ✗ ✗ ViTacFormer [12] Dex-Hand Full✗ ✓ ✓ TWIST2 [7] Dex-Hand Open/Close✓ ✗ ✗ ViTac Humanoid [22] Dex-Hand Full✗ ✓ ✗ SONIC [6] Dex-Hand Open/Close✓ ✗ ✗ Humanoid UMI [23] Gripper✓ ✗ ✗ HumDex [24] Dex-Hand Full✓ ✗ ✗ OursDex-Hand Full✓ ✓ ✓ A. 
Humanoid Whole-Body Control and Teleoperation for Manipulation Recent progress in humanoid manipulation has been en-"},{"citing_arxiv_id":"2602.11758","ref_index":28,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"HAIC: Humanoid Agile Object Interaction Control via Dynamics-Aware World Model","primary_cat":"cs.RO","submitted_at":"2026-02-12T09:34:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"HAIC enables robust humanoid interactions with underactuated objects by predicting their dynamics from proprioceptive history and using a world model for adaptive control.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.06571","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Learning Agile Striker Skills for Humanoid Soccer Robots from Noisy Sensory Input","primary_cat":"cs.RO","submitted_at":"2025-12-06T21:27:50+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A four-stage RL system with teacher-student distillation and online constrained adaptation enables humanoid robots to achieve robust ball-kicking accuracy under noisy perception in simulation and on physical hardware.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2511.07820","ref_index":26,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control","primary_cat":"cs.RO","submitted_at":"2025-11-11T04:37:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Scaling motion tracking models along size, data volume, and compute produces a foundation model for natural, robust humanoid whole-body control with 
downstream uses in kinematic planning and vision-language-action models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.25241","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"One-shot Adaptation of Humanoid Whole-body Motion with Walking Priors","primary_cat":"cs.RO","submitted_at":"2025-10-29T07:48:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A one-shot adaptation technique for humanoid whole-body motion that computes order-preserving optimal transport distances between walking and target sequences, interpolates geodesic intermediate poses, optimizes for collision-free retargeting, and adapts via reinforcement learning.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}