Sim-to-Real Transfer for Muscle-Actuated Robots via Generalized Actuator Networks

Bernhard Schölkopf; Dieter Büchler; Ingmar Posner; Jan Schneider; Le Chen; Mridul Mahajan; Simon Guist

arxiv: 2604.09487 · v2 · pith:BN4P6QA5new · submitted 2026-04-10 · 💻 cs.RO · cs.LG

Jan Schneider, Mridul Mahajan, Le Chen, Simon Guist, Bernhard Schölkopf, Ingmar Posner, Dieter Büchler

Pith reviewed 2026-05-10 16:49 UTC · model grok-4.3

classification 💻 cs.RO cs.LG

keywords: sim-to-real transfer · generalized actuator networks · muscle-actuated robots · pneumatic artificial muscles · tendon-driven systems · reinforcement learning · actuator identification · robot simulation

The pith

A neural network models nonlinear muscle actuation from joint position data, enabling sim-to-real policy transfer on a tendon-driven robot.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows how to bridge the sim-to-real gap for muscle-actuated robots by learning a model of their complex actuation behavior. The Generalized Actuator Network is trained on joint position trajectories to capture effects like friction and hysteresis without needing torque sensors. Policies for goal reaching and ball-in-a-cup tasks are then trained in a hybrid simulation and transferred directly to the physical four-DoF PAMY2 robot. If correct, this removes a major obstacle to using simulation for developing control policies on these promising but hard-to-model systems.

Core claim

We propose the Generalized Actuator Network (GeAN) that learns an actuation model directly from joint position trajectories. When integrated with rigid body dynamics simulation, this permits training of control policies entirely in simulation that transfer successfully to the real robot, as shown by precise goal-reaching and dynamic ball-in-a-cup behaviors on a tendon-driven pneumatic muscle arm.

What carries the argument

Generalized Actuator Network (GeAN), which is a neural network trained to reproduce actuator responses from position data and substitutes for direct modeling of nonlinear muscle dynamics in the simulation pipeline.
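The idea of fitting an actuator model from positions alone can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a 1-DoF rigid body, replaces the neural network with a hypothetical two-parameter actuator (command gain and viscous friction), and swaps gradient-based training for a plain grid search. All names (`actuator_torque`, `rollout`, `position_loss`, `fit`) and all numbers are illustrative assumptions. What it does show is the core mechanism: the actuator output is never observed, and the only training signal is the mismatch between simulated and recorded joint positions.

```python
import math

DT = 0.01  # integration step in seconds (assumed value)

def actuator_torque(params, command, velocity):
    """Hypothetical stand-in for a learned actuator model:
    torque = gain * command - friction * velocity."""
    gain, friction = params
    return gain * command - friction * velocity

def rollout(params, commands, q0=0.0, dq0=0.0, inertia=1.0):
    """Simulate a 1-DoF rigid body driven by the actuator model.
    Returns the joint position after every step."""
    q, dq, positions = q0, dq0, []
    for u in commands:
        tau = actuator_torque(params, u, dq)
        dq += DT * tau / inertia  # semi-implicit Euler: update velocity first
        q += DT * dq              # then position
        positions.append(q)
    return positions

def position_loss(params, commands, recorded_positions):
    """Mean squared error between simulated and recorded joint positions."""
    sim = rollout(params, commands)
    return sum((a - b) ** 2 for a, b in zip(sim, recorded_positions)) / len(sim)

def fit(commands, recorded_positions, candidates):
    """Grid search: return the candidate with the lowest position loss."""
    return min(candidates, key=lambda p: position_loss(p, commands, recorded_positions))

# Synthetic "recorded" trajectory generated with known parameters.
# On a real robot this would be logged joint positions.
commands = [math.sin(0.05 * t) for t in range(400)]
true_params = (2.0, 0.5)
recorded = rollout(true_params, commands)

# Candidate (gain, friction) pairs; the grid contains the true parameters.
grid = [(g / 2, f / 4) for g in range(1, 9) for f in range(0, 9)]
best = fit(commands, recorded, grid)
```

In this toy setting the search recovers the parameters that generated the data, because only they reproduce the recorded positions exactly. A neural actuator model trained by gradient descent would follow the same structure, with the loss still computed on positions rather than torques.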

Load-bearing premise

A neural network fitted to joint position trajectories alone can capture the nonlinear actuator dynamics including friction and hysteresis with enough fidelity to allow zero-shot transfer of policies to the real robot.

What would settle it

Running the ball-in-a-cup policy on the physical PAMY2 robot and finding that the success rate does not match simulation results, or that adaptation is required, would show the actuator model does not support adequate transfer.
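Whether a real-robot success rate "matches" the simulated one depends on how many trials were run. One simple, conservative way to compare the two is to compute a Wilson score interval for each success rate and check whether the intervals overlap. The sketch below assumes this approach; the trial counts are hypothetical placeholders and are not results from the paper.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate.
    z = 1.96 corresponds to roughly 95% confidence."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical counts for illustration only.
sim_interval = wilson_interval(95, 100)   # 95 of 100 simulated episodes succeed
real_interval = wilson_interval(9, 20)    # 9 of 20 real-robot trials succeed

mismatch = not intervals_overlap(sim_interval, real_interval)
```

With these placeholder counts the intervals do not overlap, which would point to a transfer gap. Non-overlap is a stricter criterion than a formal two-proportion test, so overlapping intervals alone should not be read as proof that the rates agree.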

Figures

Figures reproduced from arXiv: 2604.09487 by Bernhard Schölkopf, Dieter Büchler, Ingmar Posner, Jan Schneider, Le Chen, Mridul Mahajan, Simon Guist.

**Figure 2.** Overview of the sim-to-real pipeline. (1) Actuator network training with the position loss, where the network is trained …

**Figure 3.** The 4-DoF muscle-actuated robot PAMY2 [3] (left) …

**Figure 4.** Position error for the GeAN trained with the position …

**Figure 5.** Visualization of the reacher task. The motions of the …

**Figure 6.** Success rates for the reacher and ball-in-a-cup policies …

**Figure 7.** Mean absolute distance between the final joint positions …

**Figure 9.** Mean absolute position error of the GeAN-augmented …

**Figure 10.** Reacher policy success rates when using GeANs …

**Figure 11.** Position error for GeANs trained with different history …

**Figure 12.** Position error for GeANs trained with different stride …

**Figure 13.** Position error for GeANs trained with different …

Original abstract

Tendon drives paired with soft muscle actuation enable faster and safer robots while potentially accelerating skill acquisition. Still, these systems are rarely used in practice due to inherent nonlinearities, friction, and hysteresis, which complicate modeling and control. So far, these challenges have hindered policy transfer from simulation to real systems. To bridge this gap, we propose a sim-to-real pipeline that learns a neural network model of this complex actuation and leverages established rigid body simulation for the arm dynamics and interactions with the environment. Our method, called Generalized Actuator Network (GenAN), enables actuation model identification across a wide range of robots by learning directly from joint position trajectories rather than requiring torque sensors. Using GenAN on PAMY2, a tendon-driven robot powered by pneumatic artificial muscles, we successfully deploy dynamic but precise goal-reaching, ball-in-a-cup, and table tennis policies, trained entirely in simulation. To the best of our knowledge, this result constitutes the first successful sim-to-real transfer for a four-degrees-of-freedom muscle-actuated robot arm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

GeAN learns actuator models from position trajectories alone to support sim-to-real transfer on a 4-DoF muscle-actuated arm, with claimed success on both reaching and dynamic tasks.

The main thing to know is that the authors present a method called GeAN that learns models of muscle actuators directly from joint position data, enabling policies trained in simulation to transfer to a real tendon-driven arm with pneumatic muscles. They report successful deployment on both reaching tasks and a dynamic ball-in-a-cup experiment, which they say is the first such sim-to-real result for a 4-DoF muscle-actuated system.

What is new is the way it handles actuator identification without torque measurements. Prior work on sim-to-real for rigid robots often assumes known models or uses different sensing. Here, the neural network approach for generalized actuators seems tailored to the nonlinearities, such as hysteresis, that plague these systems. Combining it with off-the-shelf rigid body simulation keeps the focus on the hard part. The hardware results on PAMY2 show that the approach can produce usable policies for both precise and dynamic behaviors.

The paper's strength is that it makes the modeling step more accessible. Collecting position trajectories is straightforward on most robots, so this could apply to other muscle or soft actuator setups.

The potential issue is generalization. One concern is worth stress-testing: training on position trajectories might not expose the model to the velocity and load conditions that arise when running a closed-loop policy on the real robot. For the ball-in-a-cup task, rapid movements could reveal gaps in the learned friction or pressure dynamics that the simulator does not fix. The paper should include checks such as how well the model predicts positions under policy commands, or error breakdowns across different regimes. If those are missing or weak, the transfer success might be more brittle than it appears.

Overall, this paper targets robotics labs working on actuation modeling and transfer learning. Readers dealing with nonlinear actuators or wanting to avoid torque sensors will find the pipeline practical. It is worth sending to peer review. The contribution is focused and the experimental claim is strong enough to merit detailed feedback from experts in the area.

Referee Report

2 major / 1 minor

Summary. The paper introduces the Generalized Actuator Network (GeAN), a neural network trained exclusively on joint position trajectories to model nonlinear actuator dynamics (friction, hysteresis) in tendon-driven muscle-actuated robots. It integrates GeAN with rigid-body simulation to train policies entirely in simulation and reports zero-shot deployment on the physical PAMY2 4-DoF pneumatic artificial muscle robot for precise goal-reaching and dynamic ball-in-a-cup tasks, claiming this as the first successful sim-to-real transfer for such a system.

Significance. If the quantitative results hold, the work would be significant for enabling practical use of soft, nonlinear actuators in robotics by avoiding torque sensors and real-world adaptation, potentially accelerating development of faster and safer systems. The ball-in-a-cup result would be particularly notable as a falsifiable demonstration of dynamic transfer.

major comments (2)

[Abstract] Abstract and experimental results section: the claim of successful real-hardware deployment of goal-reaching and ball-in-a-cup policies provides no quantitative metrics (success rates, position errors, timing statistics), baselines, ablations, or error analysis, so the support for zero-shot transfer cannot be evaluated.
[Method (GeAN)] GeAN training and data-collection description: training uses only observed joint positions without torque or velocity labels; it is unclear whether the collected trajectories cover the velocity/load/hysteresis regimes of the closed-loop ball-in-a-cup policy, leaving open the possibility that friction and pressure dynamics are mispredicted under rapid state-dependent commands.

minor comments (1)

[Figures] Figure captions and text should explicitly state the number of real-robot trials and any failure modes observed during deployment.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of clarity and evidence in our presentation of the sim-to-real results. We address each major comment point by point below.

Referee: [Abstract] Abstract and experimental results section: the claim of successful real-hardware deployment of goal-reaching and ball-in-a-cup policies provides no quantitative metrics (success rates, position errors, timing statistics), baselines, ablations, or error analysis, so the support for zero-shot transfer cannot be evaluated.

Authors: We agree that the abstract and experimental results would benefit from explicit quantitative support for the zero-shot transfer claims. The manuscript reports successful deployment on the physical system, but to strengthen the evidence we will revise the abstract to include key metrics and expand the results section with success rates, mean position errors, timing statistics, baseline comparisons, and error analysis for both tasks.
Revision: yes

Referee: [Method (GeAN)] GeAN training and data-collection description: training uses only observed joint positions without torque or velocity labels; it is unclear whether the collected trajectories cover the velocity/load/hysteresis regimes of the closed-loop ball-in-a-cup policy, leaving open the possibility that friction and pressure dynamics are mispredicted under rapid state-dependent commands.

Authors: The GeAN training data were collected from a diverse set of position trajectories on the physical robot, including motions at varying speeds and under different loads to capture nonlinear effects such as friction and hysteresis. To address the concern about regime coverage for the ball-in-a-cup policy, we will revise the method section to provide a more detailed description of the data collection protocol and include supporting analysis (e.g., velocity and load distribution statistics) demonstrating that the training trajectories encompass the operating conditions of the closed-loop policy.
Revision: partial
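The regime-coverage question raised in this exchange can be made concrete with a simple statistic. The sketch below is one possible check, not an analysis from the paper: it measures what fraction of the states visited by a closed-loop policy fall inside the central quantile range of the actuator-model training data. The function names, the 1%/99% quantile choice, and the velocity samples are all hypothetical.

```python
def empirical_range(values, lower_q=0.01, upper_q=0.99):
    """Approximate lower and upper quantiles via sorted order statistics."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(lower_q * (n - 1))]
    hi = ordered[int(upper_q * (n - 1))]
    return lo, hi

def coverage_fraction(training_values, policy_values, lower_q=0.01, upper_q=0.99):
    """Fraction of policy-rollout samples that lie inside the training range."""
    lo, hi = empirical_range(training_values, lower_q, upper_q)
    inside = sum(1 for v in policy_values if lo <= v <= hi)
    return inside / len(policy_values)

# Hypothetical joint-velocity samples in rad/s, for illustration only.
training_velocities = [0.1 * i for i in range(-50, 51)]          # spans -5.0 to 5.0
policy_velocities = [-7.0, -3.0, 0.0, 2.5, 4.0, 6.5, 8.0, 1.0]   # closed-loop rollout

frac = coverage_fraction(training_velocities, policy_velocities)
```

Here five of the eight policy samples fall inside the training range, so `frac` is 0.625. A low value for a fast task such as ball-in-a-cup would suggest the actuator model is being queried outside the regime it was trained on. A per-dimension range check like this ignores correlations between joints, so it is a lower bar than a full density comparison.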

Circularity Check

0 steps flagged

No significant circularity in the sim-to-real pipeline

The paper describes a data-driven pipeline: collect real joint-position trajectories on the physical PAMY2 robot, train a neural network (GeAN) to model nonlinear actuator dynamics from those trajectories, insert the learned model into a rigid-body simulator, train policies in simulation, and deploy zero-shot on the real robot. No equation, definition, or claim reduces the reported transfer success to a fitted parameter by construction, nor does any load-bearing step rely on a self-citation chain that itself assumes the target result. The central claim remains an empirical demonstration whose validity is independently testable by repeating the data-collection and transfer experiments; it does not collapse into a renaming or tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claim rests on the hybrid simulation assumption and the data-driven actuator model; no explicit free parameters or new physical entities are named beyond the neural network itself.

axioms (1)

Domain assumption: Rigid body dynamics and environmental interactions can be accurately simulated using established physics engines once actuator forces are provided.
Rationale: The method explicitly separates actuation modeling from rigid body simulation for the arm and environment.

invented entities (1)

Generalized Actuator Network (GeAN): no independent evidence
Purpose: Neural network to model complex nonlinear actuator behavior from position trajectories.
Rationale: Core new component introduced to bridge the actuation modeling gap; no independent evidence outside the reported experiments is described.
