MuJoCoUni: Persistent Batched Runtime Primitives for MuJoCo

Junzhe Wu; Yufei Jia

arxiv: 2605.24922 · v1 · pith:WOKW5II7new · submitted 2026-05-24 · 💻 cs.RO

Pith reviewed 2026-06-30 01:11 UTC · model grok-4.3

classification 💻 cs.RO

keywords MuJoCo · batched simulation · stateful environments · robot learning · parallel execution · physics primitives · domain randomization

The pith

MuJoCoUni supplies batched stateful execution primitives that keep original MuJoCo semantics for models, sensors, contact and constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MuJoCoUni as a downstream distribution that adds runtime support for high-throughput parallel execution in online robot learning workloads. It introduces the BatchEnvPool executor to manage multiple independent environments while preserving the upstream CPU MuJoCo behavior on models, sensors, contact and constraints. The primitives include final-state-only short stepping, sparse reset with domain randomization, batched sensor forward passes that do not advance dynamics, and batched Jacobian and height-field queries. All changes are confined to the Python binding layer so that MuJoCo's solver, integrator and core contact model remain untouched. This approach is intended to meet the combined needs of speed and semantic fidelity for stateful simulation in robot learning.
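To make the "batched Jacobian query" primitive concrete, here is a minimal sketch on a toy planar two-link arm. It does not use MuJoCoUni or MuJoCo at all: the functions `end_effector`, `jacobian`, and `batched_jacobian` and the link lengths are hypothetical names chosen for illustration, and the Jacobian is written analytically rather than obtained from a physics engine. The point is only the shape of the operation: one call takes the configuration of every environment and returns one Jacobian per environment, with no dynamics advanced.

```python
import math


def end_effector(q, l1=1.0, l2=1.0):
    """Tip position (x, y) of a planar two-link arm with joint angles q."""
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return (x, y)


def jacobian(q, l1=1.0, l2=1.0):
    """2x2 Jacobian d(x, y)/d(q1, q2) of the tip position."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def batched_jacobian(qs):
    """Evaluate the Jacobian for every environment; no state is modified."""
    return [jacobian(q) for q in qs]


# One joint configuration per (toy) environment.
qs = [(0.0, 0.0), (math.pi / 2, 0.0), (0.3, -0.7)]
jacs = batched_jacobian(qs)
```

In the paper's setting the per-environment loop would run inside the C++ executor rather than in Python; this sketch only illustrates the input and output contract under the stated assumptions.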

Core claim

MuJoCoUni supplies runtime primitives for stateful environment execution through its core BatchEnvPool object, a C++/pybind11 executor that owns per-environment mjModel copies, per-thread mjData workers and an internal thread pool, delivering final-state-only short stepping, sparse reset, reset-lifecycle domain randomization, batched sensor forward evaluation without advancing dynamics, and batched Jacobian and height-field queries while retaining upstream CPU MuJoCo semantics.

What carries the argument

BatchEnvPool, a C++/pybind11 executor that owns per-environment mjModel copies, per-thread mjData workers, and an internal thread pool.
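The ownership pattern described here (per-environment model copies, per-thread workers, an internal thread pool) can be sketched in plain Python. Everything below is hypothetical: `ToyModel`, `ToyBatchPool`, and the damped point-mass dynamics are stand-ins invented for this illustration and are not the BatchEnvPool API, which lives in C++ and wraps real `mjModel` and `mjData` structures.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass
class ToyModel:
    """Stand-in for a per-environment model: parameters only, no state."""
    damping: float = 0.5
    dt: float = 0.01


class ToyBatchPool:
    """Toy executor: persistent states, per-env models, per-thread scratch."""

    def __init__(self, base_model, num_envs, num_threads=4):
        # Each environment owns its own model copy, so models may diverge.
        self.models = [replace(base_model) for _ in range(num_envs)]
        # Persistent per-environment state: [position, velocity].
        self.states = [[0.0, 1.0] for _ in range(num_envs)]
        # One scratch buffer per worker thread, reused across calls.
        self.workers = [[0.0, 0.0] for _ in range(num_threads)]
        self.num_threads = num_threads
        self._pool = ThreadPoolExecutor(max_workers=num_threads)

    def _run_chunk(self, worker_id, env_ids, nstep):
        work = self.workers[worker_id]
        for i in env_ids:
            m = self.models[i]
            work[0], work[1] = self.states[i]      # load state into scratch
            for _ in range(nstep):
                work[1] += -m.damping * work[1] * m.dt
                work[0] += work[1] * m.dt
            self.states[i] = [work[0], work[1]]    # write back final state only

    def step(self, nstep):
        """Advance all environments by nstep; return only the final states."""
        n = len(self.states)
        futures = [
            self._pool.submit(self._run_chunk, w, range(w, n, self.num_threads), nstep)
            for w in range(self.num_threads)
        ]
        for f in futures:
            f.result()  # propagate any worker exception
        return [tuple(s) for s in self.states]

    def close(self):
        self._pool.shutdown()


pool = ToyBatchPool(ToyModel(), num_envs=8, num_threads=3)
pool.models[2].damping = 2.0   # model-variant mode: env 2 differs
final = pool.step(nstep=5)
pool.close()
```

Each worker index handles a disjoint set of environments and uses its own scratch buffer, so no locking is needed in this sketch. Pure-Python threads will not give a speedup here because of the interpreter lock; a compiled executor is what makes the pattern pay off.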

If this is right

Stateful parallel environments become available for online learning while models, sensors and contact remain identical to upstream MuJoCo.
Sparse reset and reset-lifecycle randomization can be applied without full environment reconstruction.
Sensor and Jacobian queries can be batched independently of dynamics stepping.
All new functionality stays inside the binding layer so core solver and contact model behavior is unchanged.
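As an illustration of the second consequence, sparse reset with reset-lifecycle randomization might look like the following sketch. The class `ToyResetPool`, its `reset` signature, and the damping range are assumptions made for this example, not the package's interface. Only the listed environments are reinitialized, and their model parameter is resampled at the same moment; every other environment keeps its state and parameters.

```python
import random


class ToyResetPool:
    """Toy batch of damped point masses with sparse, randomized reset."""

    def __init__(self, num_envs, seed=0):
        self.rng = random.Random(seed)
        self.damping = [0.5] * num_envs                    # per-env parameter
        self.states = [[0.0, 1.0] for _ in range(num_envs)]  # [pos, vel]
        self.episode = [0] * num_envs                      # resets per env

    def step(self, dt=0.01):
        """Advance every environment by one step."""
        for i, s in enumerate(self.states):
            s[1] += -self.damping[i] * s[1] * dt
            s[0] += s[1] * dt

    def reset(self, env_ids, damping_range=(0.2, 1.0)):
        """Reset only env_ids; resample their damping as part of the reset."""
        lo, hi = damping_range
        for i in env_ids:
            self.damping[i] = self.rng.uniform(lo, hi)  # domain randomization
            self.states[i] = [0.0, 1.0]                 # initial state
            self.episode[i] += 1


pool = ToyResetPool(num_envs=6)
for _ in range(10):
    pool.step()
untouched = [list(s) for s in pool.states]
pool.reset(env_ids=[1, 4])   # environments 0, 2, 3, 5 keep running
```

Tying the parameter resample to the reset call keeps a model constant within an episode, which is the usual reason to place randomization in the reset lifecycle rather than in the step path.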

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The thread-pool design could be extended to other CPU-bound simulators that need stateful batching without solver modification.
Workloads that currently interleave short steps with sensor queries may see reduced Python overhead.
The separation of reset and stepping lifecycles may simplify integration with reinforcement-learning frameworks that require on-the-fly parameter changes.
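The workload in the second point, short steps interleaved with sensor queries, can be sketched with a toy state. The names `ToyState`, `short_step`, `forward_sensors`, and `control_loop` are hypothetical and the dynamics are a damped point mass, not MuJoCo. The sketch shows the two properties the primitives rely on: a short step keeps only the final state rather than a trajectory, and a sensor query reads the state without advancing it. In upstream MuJoCo the analogous split is between `mj_step`, which advances time, and `mj_forward`, which does not.

```python
from dataclasses import dataclass


@dataclass
class ToyState:
    """State of one toy environment: a damped point mass on a line."""
    time: float = 0.0
    pos: float = 0.0
    vel: float = 1.0


def short_step(state, nstep, dt=0.01, damping=0.5):
    """Advance by nstep substeps in place; no intermediate states are kept."""
    for _ in range(nstep):
        state.vel += -damping * state.vel * dt
        state.pos += state.vel * dt
        state.time += dt


def forward_sensors(state):
    """Compute derived readings from the current state without mutating it."""
    return {"position": state.pos, "kinetic_energy": 0.5 * state.vel ** 2}


def control_loop(states, iterations, nstep):
    """Interleave batched short steps with batched sensor reads."""
    observations = []
    for _ in range(iterations):
        for s in states:
            short_step(s, nstep)
        observations.append([forward_sensors(s) for s in states])
    return observations


states = [ToyState(vel=float(i + 1)) for i in range(4)]
obs = control_loop(states, iterations=3, nstep=5)
snapshot = [(s.time, s.pos, s.vel) for s in states]
extra = [forward_sensors(s) for s in states]  # query again, no stepping
```

In this Python version every inner loop crosses the interpreter once per environment; the overhead reduction the point above speculates about would come from issuing each batched step and each batched query as a single call into compiled code.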

Load-bearing premise

The listed primitives are sufficient to satisfy both the throughput and semantic requirements of the target online robot learning workloads.

What would settle it

A direct comparison showing that mujoco.rollout already delivers equivalent throughput and full upstream semantics on the same workloads without additional primitives would falsify the need for MuJoCoUni.

Figures

Figures reproduced from arXiv: 2605.24922 by Junzhe Wu, Yufei Jia.

**Figure 1.** MuJoCoUni architecture. BatchEnvPool maintains persistent model and worker resources behind the Python interface and executes batched operations through standard MuJoCo calls without modifying the physics kernel. Adjacent paper text: "... is not a replacement claim against GPU-resident simulation; it is a CPU-batched backend for MuJoCo workloads where feature coverage matters more than accelerator residency."

**Figure 2.** Robot benchmark models and throughput. Left: the four robot models used in the throughput benchmarks: (1) Unitree Go1, (2) Wonik Allegro, (3) Franka Panda, and (4) CMU Humanoid. Right: batched step and forward throughput for these models; throughput saturates when the 16-thread pool becomes fully utilized.

**Figure 3.** Step throughput: single shared model versus per-environment model variants.

**Figure 4.** Reset latency on Go1. Left: full reset across environment counts. Right: partial reset at 4096 environments.

**Figure 5.** Site-Jacobian computation time on Franka Emika Panda.

**Figure 6.** sample_hfield_height performance on a stairs-terrain heightfield with a 4 × 4 sampling grid per environment. At 4096 environments, the C++ path takes 0.52 ms versus 290 ms for a Python loop, a ∼555× speedup. Plot: height-field sampling time (ms, log scale) against number of environments (32 to 4096) for a Python for-loop, Python multiprocessing, and MuJoCoUni.
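The query benchmarked in Figure 6 reads terrain height on a small grid of points around each environment. A minimal sketch of that kind of lookup follows, using bilinear interpolation over a row-major height grid. The grid layout, the interpolation scheme, the sampling spacing, and the names `sample_height`, `sample_grid`, and `batched_sample` are all assumptions for illustration; the source does not say how sample_hfield_height is implemented, and MuJoCo's own height-field conventions are not reproduced here.

```python
def sample_height(heights, cell, x, y):
    """Bilinearly interpolate a row-major height grid at point (x, y).

    heights[r][c] is the height at x = c * cell, y = r * cell.
    Points outside the grid are clamped to its border.
    """
    rows, cols = len(heights), len(heights[0])
    gx = min(max(x / cell, 0.0), cols - 1.0)
    gy = min(max(y / cell, 0.0), rows - 1.0)
    c0 = min(int(gx), cols - 2)
    r0 = min(int(gy), rows - 2)
    tx, ty = gx - c0, gy - r0
    near = heights[r0][c0] * (1 - tx) + heights[r0][c0 + 1] * tx
    far = heights[r0 + 1][c0] * (1 - tx) + heights[r0 + 1][c0 + 1] * tx
    return near * (1 - ty) + far * ty


def sample_grid(heights, cell, base_xy, spacing=0.25, n=4):
    """Sample an n x n grid of heights centred on base_xy."""
    bx, by = base_xy
    offsets = [(k - (n - 1) / 2) * spacing for k in range(n)]
    return [
        [sample_height(heights, cell, bx + dx, by + dy) for dx in offsets]
        for dy in offsets
    ]


def batched_sample(heights, cell, bases):
    """One n x n height patch per environment base position."""
    return [sample_grid(heights, cell, b) for b in bases]


# Toy stairs: height rises by 0.25 every two columns along x.
stairs = [[0.25 * (c // 2) for c in range(8)] for _ in range(8)]
patches = batched_sample(stairs, 0.5, [(1.0, 1.0), (2.0, 1.5), (3.0, 2.0)])
```

The speedup reported in the figure comes from running this per-environment loop in compiled code; the sketch only shows what is being computed per sample under the stated assumptions.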

Original abstract

We present MuJoCoUni, a downstream MuJoCo distribution for online robot learning and batched physics evaluation. Alongside the open-loop batched trajectory generation already provided by upstream mujoco.rollout, MuJoCoUni supplies runtime primitives for stateful environment execution. The target workloads need high-throughput parallel execution while retaining upstream CPU MuJoCo semantics for models, sensors, contact, and constraints. Its core object, BatchEnvPool, is a C++/pybind11 executor that owns per-environment mjModel copies, per-thread mjData workers, and an internal thread pool. It provides final-state-only short stepping, sparse reset, reset-lifecycle domain randomization, batched sensor forward evaluation without advancing dynamics, and batched Jacobian and height-field queries. The implementation is confined to the Python binding layer; MuJoCo's solver, contact model, integrator, and core source tree retain upstream semantics. This report describes the BatchEnvPool API, implementation boundary, relationship to rollout, and the validation and benchmark scripts shipped with the open-source mujoco-uni package, which is installed with pip install mujoco-uni.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

MuJoCoUni adds a BatchEnvPool wrapper for stateful batched MuJoCo execution with sparse resets and non-advancing queries, all kept in the binding layer.

The main takeaway is that this paper ships a C++/pybind11 executor called BatchEnvPool that runs multiple MuJoCo environments persistently in parallel. It supports final-state-only short steps, sparse resets, reset-linked domain randomization, batched sensor forward passes that skip dynamics, and batched Jacobian and height-field queries, all while copying mjModel per environment and using per-thread mjData with an internal thread pool.

What the work does well is stay strictly inside the Python binding layer. No changes touch the solver, contact model, or integrator, so upstream semantics for sensors, contacts, and constraints are preserved by design. The description of the API boundary and its relation to mujoco.rollout is clear, and the package includes validation and benchmark scripts that users can run themselves after a pip install.

The specific combination of persistent state, sparse reset, and batched non-advancing queries is new relative to the upstream rollout functionality. That fills a practical gap for online robot learning workloads that need high-throughput parallel execution without rewriting their models.

Soft spots are modest and mostly about scope. Batched physics simulation itself is not a new idea, and this remains an engineering layer rather than a methods advance. Performance claims rest on the shipped benchmarks, which is appropriate for a tool paper but means the report itself does not contain independent analysis of overhead from the thread pool or per-environment copies. No hidden dependencies or semantic drift appear in the stated approach.

This is for robotics researchers and engineers who already use MuJoCo for learning pipelines and need exactly these runtime primitives for parallel stateful environments. A reader building custom training loops would get direct value from the package and the implementation details.

It deserves peer review in a tools or systems venue because the implementation boundary is explicit, the code ships with checks, and the target use case is common. I would send it out rather than desk reject.

Referee Report

0 major / 3 minor

Summary. The manuscript presents MuJoCoUni, a downstream MuJoCo distribution that introduces BatchEnvPool, a C++/pybind11 executor providing runtime primitives for stateful batched environment execution (final-state-only short stepping, sparse reset, reset-lifecycle domain randomization, batched sensor forward evaluation, and batched Jacobian/height-field queries). These are intended for high-throughput parallel execution in online robot learning while retaining upstream CPU MuJoCo semantics for models, sensors, contact, and constraints. The implementation is confined to the Python binding layer using per-environment mjModel copies, per-thread mjData workers, and an internal thread pool; no changes are made to MuJoCo's solver, contact model, or core source. The paper describes the BatchEnvPool API, its relationship to mujoco.rollout, and ships validation and benchmark scripts with the mujoco-uni package.

Significance. If the implementation and semantic-retention claims hold, the work supplies practical, reusable primitives that address a common bottleneck in batched physics for robot learning without requiring modifications to MuJoCo internals. The explicit boundary description, provision of validation scripts, and open-source pip-installable package are concrete strengths that support adoption and reproducibility.

minor comments (3)

The description of the BatchEnvPool API would benefit from an explicit enumeration of all public methods and their signatures (including return types and side effects) in the main text, rather than relying solely on the shipped code examples.
The relationship to mujoco.rollout is mentioned but lacks a direct side-by-side comparison of supported workloads or performance characteristics; adding a short table or paragraph would clarify the intended division of use cases.
The benchmark scripts are referenced but no summary statistics (e.g., throughput numbers or scaling behavior) appear in the manuscript itself; including a concise results table would strengthen the performance claims without altering the tool-focused scope.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity; software implementation paper

The manuscript is a tool release describing the BatchEnvPool API, its C++/pybind11 implementation boundary, and shipped validation scripts. It contains no equations, no fitted parameters, no derivations, and no predictions. All claims reduce to direct description of code that wraps upstream MuJoCo without altering its solver or semantics. No self-citation chains or ansatzes are present. Zero flagged steps is the expected outcome for a pure implementation report.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software engineering paper. No free parameters, mathematical axioms, or invented physical entities are introduced.

pith-pipeline@v0.9.1-grok · 5727 in / 1074 out tokens · 33497 ms · 2026-06-30T01:11:31.171237+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

UniLab: A Heterogeneous Architecture for Robot RL Beyond GPU-Dominant Paradigms
cs.RO · 2026-05 · unverdicted · novelty 6.0

UniLab is a CPU/GPU heterogeneous system for robot RL training using MuJoCoUni and MotrixSim backends that reports 3-10x end-to-end efficiency improvements and cross-platform compatibility beyond CUDA.
