UMI-Bench 1.0 is presented as the first open benchmark dedicated to reproducible real-world evaluation of Universal Manipulation Interface policies.
arXiv preprint arXiv:2510.17950
16 Pith papers cite this work.
Citations by year: 2026 (16). Citation roles: background (3).
Representative citing papers
PhAIL provides an open benchmark and distributional evaluation method for real-robot VLA policies using time-to-success CDF, HRT scoring, and KS significance tests.
SIMPLE is a new large-scale simulation benchmark for humanoid loco-manipulation that integrates accurate dynamics and photorealistic rendering and demonstrates policy transfer from simulation to physical robots.
PACE dynamically selects execution horizons for action chunks in robot policies by detecting low-speed transition points in predicted speed profiles, raising success rates from 57.8% to 64.2% on 50 simulation tasks and from 50.7% to 70.4% in real-robot tests.
X-DiffVLA proposes a diffusion VLA model that uses Embodiment Forcing and Morphological Tree Diffusion to achieve state-of-the-art cross-embodiment performance on simulation benchmarks, with reported gains of 15.3% and 12.5%.
A1 is a transparent VLA framework achieving state-of-the-art robot manipulation success with up to 72% lower latency via adaptive layer truncation and inter-layer flow matching.
LingBot-VLA is a VLA foundation model trained on massive real robot data that shows superior generalization across tasks and platforms with fast training throughput.
PACT is a self-evolving post-training framework that projects diffusion policies onto constraint-feasible regions via reverse-KL distillation and a tightening curriculum, reporting 31% fewer safety violations and 30.7% higher task success on embodied manipulation benchmarks.
Wall-OSS-0.5 is a 4B VLA model pretrained across many embodiments that achieves zero-shot real-robot performance on a 17-task suite and outperforms π_0.5 after fine-tuning.
VLA-REPLICA is a low-cost and reproducible real-world benchmark for evaluating VLA models in robotic manipulation tasks.
ProcVLM learns procedure-grounded dense progress rewards for robotic manipulation via a reasoning-before-estimation VLM trained on a 60M-frame synthesized corpus from 30 embodied datasets.
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
JoyAI-Sim provides bidirectional Robot-Simulation-Human pathways for aligned model evaluation and data generation in robotics using the JoySim simulator as an evaluation layer and physical consistency filter.
Automation in embodied benchmark construction shifts costs from acquisition toward validation, auditability, version control, and long-term governance instead of simply lowering total cost.
This survey reviews trends, challenges, benchmarks, and future directions in action-conditioned interactive world modeling for video and 3D generation.