RoboProcessBench is a new benchmark decomposing process-aware understanding into static monitoring and dynamic reasoning across 12 question families, with evaluations showing VLM limitations but post-training gains on the provided data.
RoboHiMan: A hierarchical evaluation paradigm for compositional generalization in long-horizon manipulation.arXiv preprint arXiv:2510.13149, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Adding recurrent memory tokens to VLA models raises success rates on partially observable manipulation tasks from 0.42 to 0.84 on training and 0.07 to 0.23 on held-out tasks while preserving performance under full observability.
citing papers explorer
-
RoboProcessBench: Benchmarking Process-Aware Understanding in Vision-Language Robotic Manipulation
RoboProcessBench is a new benchmark decomposing process-aware understanding into static monitoring and dynamic reasoning across 12 question families, with evaluations showing VLM limitations but post-training gains on the provided data.
-
$\mu$VLA: On Recurrent Memory for Partially Observable Manipulation in VLA Models
Adding recurrent memory tokens to VLA models raises success rates on partially observable manipulation tasks from 0.42 to 0.84 on training and 0.07 to 0.23 on held-out tasks while preserving performance under full observability.