Local Online Motor Babbling: Learning Motor Abundance of A Musculoskeletal Robot Arm

Arne Hitzmann; Jan Peters; Koh Hosoda; Shuhei Ikemoto; Svenja Stark; Zinan Liu

arxiv: 1906.09013 · v1 · pith:BMITR72Nnew · submitted 2019-06-21 · 💻 cs.RO

Zinan Liu, Arne Hitzmann, Shuhei Ikemoto, Svenja Stark, Jan Peters, Koh Hosoda

Pith reviewed 2026-05-25 19:06 UTC · model grok-4.3

classification 💻 cs.RO

keywords: motor babbling, goal babbling, motor abundance, musculoskeletal robot, inverse kinematics, CMA-ES, sensorimotor learning, muscle synergy

The pith

Directed goal babbling followed by local CMA-ES motor babbling lets a 10-DoF musculoskeletal arm learn inverse kinematics and query multiple motor solutions for any goal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to first use simple heuristics to define a goal space and apply directed goal babbling to learn the inverse kinematics of a redundant musculoskeletal robot arm. It then introduces local online motor babbling that runs CMA-ES starting from the collected samples, allowing the system to find different muscle activation patterns that reach the same static goal. This treats motor redundancy as motor abundance that can be explored on demand rather than avoided. A sympathetic reader would care because it offers a way to generate and inspect the many muscle solutions that exist in soft, high-DoF systems and to extract patterns such as stiffness and synergy from them.

Core claim

By first learning the inverse kinematics through directed goal babbling on an empirically defined goal space and then applying local online motor babbling initialized with CMA-ES on collected samples, the method enables querying motor abundance for static goals, revealing insights into muscle stiffness and synergy in a 10 DoF arm.
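As a rough illustration of the first stage, the sketch below runs a goal-babbling-style loop on a toy 3-joint planar arm, not on the 10 DoF musculoskeletal arm. Everything in it is an assumption made for illustration: the `forward` kinematics, the nearest-neighbour inverse estimate, the noise level, and the straight-line goal paths. The paper's actual learner and goal-space heuristics are not reproduced here.

```python
import math
import random

LINKS = [1.0, 0.8, 0.5]  # hypothetical link lengths of a toy planar arm


def forward(q):
    """Forward kinematics: joint angles -> 2-D end-effector position."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return (x, y)


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def inverse_estimate(samples, goal):
    """Nearest-neighbour inverse model: reuse the command whose outcome was closest."""
    return min(samples, key=lambda s: dist(s[1], goal))[0]


def goal_babbling(goals, steps_per_path=10, n_paths=200, noise=0.05, seed=0):
    rng = random.Random(seed)
    home_q = [0.3, 0.3, 0.3]
    samples = [(home_q, forward(home_q))]  # (command, outcome) pairs
    current = forward(home_q)
    for _ in range(n_paths):
        target = rng.choice(goals)
        start = current
        for step in range(1, steps_per_path + 1):
            # Move the intermediate goal along a straight line towards the target.
            t = step / steps_per_path
            goal = (start[0] + t * (target[0] - start[0]),
                    start[1] + t * (target[1] - start[1]))
            # Query the current inverse estimate and perturb it to explore.
            q = [qi + rng.gauss(0.0, noise) for qi in inverse_estimate(samples, goal)]
            current = forward(q)
            samples.append((q, current))
    return samples


def mean_error(samples, goals):
    """Average distance between each goal and the outcome of its estimated command."""
    return sum(dist(forward(inverse_estimate(samples, g)), g) for g in goals) / len(goals)


goals = [(1.5, 0.5), (1.0, 1.2), (0.5, 1.6), (1.8, 0.0)]
samples = goal_babbling(goals)
error_before = mean_error(samples[:1], goals)  # only the home posture is known
error_after = mean_error(samples, goals)       # after babbling
```

The stored `(command, outcome)` pairs are the kind of data a second, local search stage could start from; a real system would replace the nearest-neighbour lookup with a proper regression model.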

What carries the argument

local online motor babbling using Covariance Matrix Adaptation Evolution Strategy (CMA-ES) bootstrapped on goal babbling samples
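The bootstrapping idea can be sketched in a few lines. Note that the code below does not implement CMA-ES: it uses a simplified evolution strategy with a diagonal spread and elite averaging as a stand-in, so that the example stays self-contained. The toy `forward` model with antagonist pairs, the population sizes, and the tolerance are all hypothetical.

```python
import math
import random


def forward(m):
    """Hypothetical toy forward model: 4 'muscle' commands -> 2-D position.
    Antagonist pairs cancel, so many commands reach the same position."""
    return (m[0] - m[1], m[2] - m[3])


def reach_error(m, goal):
    x, y = forward(m)
    return math.hypot(x - goal[0], y - goal[1])


def local_motor_babbling(goal, samples, n_gen=40, lam=12, mu=4, seed=0):
    rng = random.Random(seed)
    # Bootstrap: initialise mean and per-dimension spread from the mu stored
    # samples whose outcomes lie closest to the goal.
    nearest = sorted(samples, key=lambda m: reach_error(m, goal))[:mu]
    dim = len(nearest[0])
    mean = [sum(m[i] for m in nearest) / mu for i in range(dim)]
    sigma = [max(0.05, max(m[i] for m in nearest) - min(m[i] for m in nearest))
             for i in range(dim)]
    visited = []
    for _ in range(n_gen):
        # Sample commands around the mean, clipped to the range [0, 1].
        pop = [[min(1.0, max(0.0, rng.gauss(mean[i], sigma[i]))) for i in range(dim)]
               for _ in range(lam)]
        pop.sort(key=lambda m: reach_error(m, goal))
        elite = pop[:mu]
        visited.extend(pop)
        # Update mean and diagonal spread from the elite (no full covariance).
        mean = [sum(m[i] for m in elite) / mu for i in range(dim)]
        sigma = [max(1e-3, math.sqrt(sum((m[i] - mean[i]) ** 2 for m in elite) / mu))
                 for i in range(dim)]
    # Best command seen, including the bootstrap samples themselves.
    best = min(nearest + visited, key=lambda m: reach_error(m, goal))
    return best, visited


rng = random.Random(1)
samples = [[rng.random() for _ in range(4)] for _ in range(200)]  # stand-in for babbling data
goal = (0.2, -0.1)
best, visited = local_motor_babbling(goal, samples)
# Commands that reach the goal within a (hypothetical) tolerance.
solutions = [m for m in visited if reach_error(m, goal) < 0.05]
```

Keeping every evaluated command in `visited`, rather than only the final mean, is what makes the search usable for collecting several solutions for one goal. A full CMA-ES would additionally adapt a complete covariance matrix and a global step size.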

If this is right

Motor abundance can be queried for any static goal within the defined goal space.
The bootstrapped CMA-ES search efficiently explores redundant motor solutions without starting from scratch.
The collected activation patterns yield concrete observations about muscle stiffness and synergy.
The two-stage process separates learning the basic mapping from exploring its redundant realizations.
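One common way to summarize co-variation among muscles is to compute the covariance of the collected activation vectors and inspect its leading eigenvector. The sketch below does this with plain power iteration. Whether this matches the paper's own stiffness and synergy analysis is an assumption, and the activation data are made up for illustration.

```python
import math


def covariance(vectors):
    """Sample covariance matrix of a list of equal-length activation vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def dominant_direction(cov, iterations=200):
    """Leading eigenvector and eigenvalue by power iteration.
    Assumes the start vector is not orthogonal to the leading eigenvector."""
    dim = len(cov)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # zero covariance: no direction to report
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the variance along the found direction.
    eigenvalue = sum(vec[i] * cov[i][j] * vec[j] for i in range(dim) for j in range(dim))
    return vec, eigenvalue


# Made-up activations: muscles 0 and 1 rise together, muscle 2 stays constant.
activations = [[0.1, 0.1, 0.5], [0.2, 0.2, 0.5], [0.3, 0.3, 0.5], [0.4, 0.4, 0.5]]
cov = covariance(activations)
direction, variance = dominant_direction(cov)
```

In this toy data the dominant direction weights muscles 0 and 1 equally and ignores muscle 2, which is the kind of co-activation pattern a synergy analysis looks for.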

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same staged approach might be tested on dynamic goals by feeding the discovered abundance patterns into a trajectory planner.
Synergy patterns extracted this way could be compared directly against recorded human muscle data for the same reaching tasks.
If the heuristics for goal-space definition prove stable across different arm morphologies, the method could transfer to other high-redundancy soft robots without redesign.

Load-bearing premise

Simple heuristics can empirically define the unknown goal space in a way that supports both inverse kinematics learning and subsequent motor abundance exploration via CMA-ES.
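To make the premise concrete, here is one plausible reading of an empirically defined goal space: sample random postures, record where the end effector lands, and take the convex hull of those points. This is a sketch under assumptions, not the paper's heuristic. The 2-link planar `forward` model, the joint ranges, and the 2-D setting are hypothetical; only the idea of sampling 2000 random postures comes from the Figure 2 caption.

```python
import math
import random


def forward(q):
    """Toy 2-link planar arm: joint angles -> 2-D end-effector position."""
    x = math.cos(q[0]) + 0.8 * math.cos(q[0] + q[1])
    y = math.sin(q[0]) + 0.8 * math.sin(q[0] + q[1])
    return (x, y)


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside(hull, p, eps=1e-9):
    """True if p lies inside or on the counter-clockwise convex polygon."""
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= -eps
               for i in range(len(hull)))


rng = random.Random(0)
postures = [(rng.uniform(0.0, math.pi / 2), rng.uniform(0.0, math.pi / 2))
            for _ in range(2000)]
empirical = [forward(q) for q in postures]  # sampled end-effector positions
hull = convex_hull(empirical)               # convex region enclosing the samples
```

A convex region is convenient because straight-line paths between goals stay inside it, but it can also include positions the arm cannot actually reach, which is one way such a heuristic could bias later results.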

What would settle it

If CMA-ES runs on the goal-babbling samples fail to return multiple distinct muscle activation vectors that all reach the same goal position, or if the returned activations show no measurable variation in stiffness or synergy structure.
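The first half of this test can be phrased as a small check: keep only the commands that reach the goal within a tolerance, then count how many of them are mutually distinct. The sketch below is a minimal version of that check. The function name, the tolerances, and the one-dimensional antagonist `toy_forward` model are all hypothetical.

```python
import math


def distinct_solutions(candidates, forward, goal, reach_tol, min_separation):
    """Greedily keep commands that reach the goal and differ from those kept so far."""
    kept = []
    for m in candidates:
        if math.dist(forward(m), goal) > reach_tol:
            continue  # does not reach the goal
        if all(math.dist(m, k) >= min_separation for k in kept):
            kept.append(m)
    return kept


def toy_forward(m):
    """Hypothetical antagonist pair: only the difference of activations matters."""
    return (m[0] - m[1],)


candidates = [
    [0.3, 0.1],   # reaches 0.2 with low co-contraction
    [0.9, 0.7],   # reaches 0.2 with high co-contraction
    [0.31, 0.1],  # reaches the goal but is nearly identical to the first
    [0.8, 0.1],   # misses the goal
]
kept = distinct_solutions(candidates, toy_forward, goal=(0.2,),
                          reach_tol=0.02, min_separation=0.1)
```

Here `kept` contains the first two commands: both reach the goal, and they differ mainly in co-contraction level. If a search only ever returned near-duplicates, `kept` would have length one and the abundance claim would not be supported for that goal.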

Figures

Figures reproduced from arXiv: 1906.09013 by Arne Hitzmann, Jan Peters, Koh Hosoda, Shuhei Ikemoto, Svenja Stark, Zinan Liu.

**Figure 1.** The control accuracy of the robot is tested according …

**Figure 2.** Empirical goal space XE (in blue) sampled from 2000 random postures, and the convex goal space XC (in red), which is used for learning as shown in …

**Figure 3.** Decreasing performance error up to 20000 samples, i.e., the average …

**Figure 4.** Performance error distribution of the convex goal space …

**Figure 7.** Comparing the reaching error and muscle variability of directed …

**Figure 8.** One evolution trial for goal 44, the search of the step-size increases …

**Figure 9.** Comparing baseline and CMA-ES covariances, where the largest …

Abstract

Motor babbling and goal babbling have been used for sensorimotor learning of highly redundant systems in soft robotics. Recent work in goal babbling has demonstrated successful learning of inverse kinematics (IK) on such systems, and suggests that babbling in the goal space better resolves motor redundancy by learning as few sensorimotor mappings as possible. However, for musculoskeletal robot systems, motor redundancy can provide useful information to explain muscle activation patterns, hence the term motor abundance. In this work, we introduce some simple heuristics to empirically define the unknown goal space, and learn the inverse kinematics of a 10 DoF musculoskeletal robot arm using directed goal babbling. We then propose local online motor babbling using Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which bootstraps on the samples collected during goal babbling for initialization, such that motor abundance can be queried for any static goal within the defined goal space. The results show that our motor babbling approach can efficiently explore motor abundance and gives useful insights in terms of muscle stiffness and synergy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper applies goal babbling plus CMA-ES to query motor abundance on a 10-DoF musculoskeletal arm, but the abstract gives almost no experimental evidence to back the efficiency or insight claims.

The core contribution is a two-stage procedure on a real 10-DoF musculoskeletal arm: first directed goal babbling with simple heuristics to learn inverse kinematics, then CMA-ES initialized from those samples to perform local online motor babbling and explore abundance around static goals. The bootstrapping step and the focus on muscle-level abundance rather than just redundancy resolution are the parts that feel like a legitimate incremental step for this hardware class. The abstract also claims the method yields insights on stiffness and synergy, which is the kind of downstream payoff people in soft robotics actually care about.

That said, the abstract contains no numbers, no baselines, no error bars, and no description of how the goal-space heuristics were chosen or tested. Without those, it is impossible to tell whether the reported efficiency is real or whether the heuristics simply carve out a convenient subspace that makes both stages look good. The stress-test concern about possible bias in the goal-space definition therefore lands; if the full paper does not include coverage checks or sensitivity runs on the heuristics, the central claims rest on an unexamined modeling choice.

This work is aimed at researchers who already work on redundant musculoskeletal or soft robots and who need concrete ways to sample abundance rather than just resolve it. A reader already familiar with goal babbling and CMA-ES will see the combination quickly, but the lack of quantitative detail makes it hard to judge how much new ground is actually broken. I would send it to peer review so the experimental section can be examined; the idea is narrow but the hardware is non-trivial and the bootstrapping trick is worth checking.

Referee Report

2 major / 0 minor

Summary. The manuscript claims that simple heuristics can empirically define the unknown goal space of a 10-DoF musculoskeletal robot arm, enabling directed goal babbling to learn inverse kinematics; a subsequent local online motor babbling procedure using CMA-ES (bootstrapped on the collected samples) then allows efficient querying of motor abundance for any static goal within that space, yielding insights into muscle stiffness and synergy.

Significance. If the heuristics are shown not to introduce bias and the efficiency claims are validated with proper controls, the work could contribute a practical method for exploring motor redundancy in soft robotic systems and relating it to biological motor abundance concepts.

major comments (2)

[Abstract] The central efficiency and insight claims ('can efficiently explore motor abundance, and gives useful insights in terms of muscle stiffness and synergy') are unsupported by any experimental details, error bars, baselines, or validation metrics, preventing assessment of whether the results hold.

[Abstract] The 'simple heuristics to empirically define the unknown goal space' are introduced without derivation, coverage argument, or sensitivity analysis showing that alternative definitions would produce equivalent IK learning or CMA-ES abundance results; this is load-bearing for the claim that the method avoids artifacts in both stages.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We agree that the abstract would benefit from clearer linkage to the experimental evidence and additional justification for the goal-space heuristics. We address each comment below and will incorporate revisions in the next version of the manuscript.

Referee: [Abstract] The central efficiency and insight claims ('can efficiently explore motor abundance, and gives useful insights in terms of muscle stiffness and synergy') are unsupported by any experimental details, error bars, baselines, or validation metrics, preventing assessment of whether the results hold.

Authors: We acknowledge that the abstract is highly condensed and does not itself contain error bars, baselines, or quantitative metrics. The full manuscript presents these in the results section through figures comparing sample efficiency of the CMA-ES procedure against random sampling baselines, with plotted means and standard deviations across multiple runs, plus qualitative analysis of muscle activation patterns for stiffness and synergy. To address the concern, we will revise the abstract to include one or two concrete indicators of the reported efficiency (e.g., sample counts required for stable abundance queries) while remaining within length limits.

revision: yes

Referee: [Abstract] The 'simple heuristics to empirically define the unknown goal space' are introduced without derivation, coverage argument, or sensitivity analysis showing that alternative definitions would produce equivalent IK learning or CMA-ES abundance results; this is load-bearing for the claim that the method avoids artifacts in both stages.

Authors: The heuristics are described in the methods as empirical bounds derived from the robot's reachable workspace and joint limits; the manuscript shows that directed goal babbling within these bounds successfully learns IK. We agree that a formal coverage argument and sensitivity study to alternative bounds would strengthen the claim that results are not artifacts. We will add a short paragraph in the methods or discussion section providing the rationale for the chosen bounds and a brief sensitivity check using one alternative definition.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical heuristics and experimental results are self-contained

The paper introduces simple heuristics to define an unknown goal space for directed goal babbling on a musculoskeletal arm, then applies CMA-ES for local motor abundance queries. No equations, derivations, or first-principles claims are present in the provided text. The approach is explicitly empirical, with results reported from robot experiments rather than any reduction of outputs to fitted inputs or self-citations by construction. The central claims rest on observed efficiency and insights from data collection, not on any loop where a prediction equals its own definition. This is a standard non-circular empirical robotics paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities can be extracted beyond the stated heuristics for goal-space definition.

axioms (1)

domain assumption: Simple heuristics suffice to empirically define the unknown goal space for directed goal babbling.
Stated directly in the abstract as the basis for learning IK.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

[1] R. P. Paul, Robot manipulators: mathematics, programming, and control: the computer control of robot manipulators. Richard Paul, 1981.

[2] D. Nguyen-Tuong and J. Peters, “Model learning for robot control: a survey,” Cognitive processing, vol. 12, no. 4, pp. 319–340, 2011.

[3] M. Rolf and J. J. Steil, “Efficient exploratory learning of inverse kinematics on a bionic elephant trunk,” IEEE transactions on neural networks and learning systems, vol. 25, no. 6, pp. 1147–1160, 2014.

[4] K. Hosoda, S. Sekimoto, Y. Nishigori, S. Takamuku, and S. Ikemoto, “Anthropomorphic muscular–skeletal robotic upper limb for understanding embodied intelligence,” Advanced Robotics, vol. 26, no. 7, pp. 729–744, 2012.

[5] R. Niiyama, S. Nishikawa, and Y. Kuniyoshi, “Biomechanical approach to open-loop bipedal running with a musculoskeletal athlete robot,” Advanced Robotics, vol. 26, no. 3-4, pp. 383–398, 2012.

[6] A. Hitzmann, H. Masuda, S. Ikemoto, and K. Hosoda, “Anthropomorphic musculoskeletal 10 degrees-of-freedom robot arm driven by pneumatic artificial muscles,” Advanced Robotics, vol. 32, no. 15, pp. 865–878, 2018.

[7] P. Gaudiano and S. Grossberg, “Vector associative maps: Unsupervised real-time error-based learning and control of movement trajectories,” Neural networks, vol. 4, no. 2, pp. 147–183, 1991.

[8] Y. Demiris and A. Dearden, “From motor babbling to hierarchical learning by imitation: a robot developmental pathway,” 2005.

[9] A. D’Souza, S. Vijayakumar, and S. Schaal, “Learning inverse kinematics,” in Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No. 01CH37180), vol. 1. IEEE, 2001, pp. 298–303.

[10] A. Baranes and P.-Y. Oudeyer, “Active learning of inverse models with intrinsically motivated goal exploration in robots,” Robotics and Autonomous Systems, vol. 61, no. 1, pp. 49–73, 2013.

[11] M. Rolf, J. J. Steil, and M. Gienger, “Online goal babbling for rapid bootstrapping of inverse models in high dimensions,” in Development and Learning (ICDL), 2011 IEEE International Conference on, vol. 2. IEEE, 2011, pp. 1–8.

[12] M. Latash, “There is no motor redundancy in human movements. there is motor abundance,” 2000.

[13] M. L. Latash, “The bliss (not the problem) of motor abundance (not redundancy),” Experimental brain research, vol. 217, no. 1, pp. 1–5, 2012.

[14] P. Singh, S. Jana, A. Ghosal, and A. Murthy, “Exploration of joint redundancy but not task space variability facilitates supervised motor learning,” Proceedings of the National Academy of Sciences, vol. 113, no. 50, pp. 14 414–14 419, 2016.

[15] T. Kohonen, “The self-organizing map,” Proceedings of the IEEE, vol. 78, no. 9, pp. 1464–1480, 1990.

[16] N. Hansen, “The CMA evolution strategy: A tutorial,” arXiv preprint arXiv:1604.00772, 2016.

[17] N. Hansen, Y. Akimoto, and P. Baudis, “CMA-ES/pycma on Github,” Zenodo, DOI:10.5281/zenodo.2559634, Feb. 2019. [Online]. Available: https://doi.org/10.5281/zenodo.2559634

[18] C. M. Bishop, Pattern recognition and machine learning. Springer, 2006.

[19] G. Torres-Oviedo, J. M. Macpherson, and L. H. Ting, “Muscle synergy organization is robust across a variety of postural perturbations,” Journal of neurophysiology, 2006.

[20] M. C. Tresch and A. Jarc, “The case for and against muscle synergies,” Current opinion in neurobiology, vol. 19, no. 6, pp. 601–607, 2009.
