Trajectory-based actuator identification via differentiable simulation
Pith reviewed 2026-05-10 15:18 UTC · model grok-4.3
The pith
Differentiable simulation recovers accurate actuator models from joint trajectories without torque sensing.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We formulate actuator identification as an optimization problem that minimizes the mismatch between simulated and measured joint trajectories by differentiating through the simulator. This torque-sensor-free procedure recovers parameters for a high-gear-ratio actuator that achieve a mean absolute position error of 7.54 mrad on held-out real trajectories, compared to 14.20 mrad for a supervised baseline. When these models are used to train locomotion policies, the robot travels 46% farther with 75% less rotational deviation.
What carries the argument
Gradient-based optimization of actuator parameters by backpropagating through a differentiable dynamics simulator, using only position and velocity trajectory data.
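The idea of differentiating a trajectory-matching loss through the simulator can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the paper's implementation: it uses a single unit-inertia joint with an effective stiffness `kp` and an effective damping `d`, synthetic "measured" positions in place of encoder data, and forward-mode dual numbers instead of reverse-mode backpropagation. All names (`Dual`, `rollout`, `identify`) and all numeric values are hypothetical.

```python
import math


class Dual:
    """A value together with its derivative w.r.t. one seeded parameter."""
    __slots__ = ("val", "der")

    def __init__(self, val, der=0.0):
        self.val, self.der = float(val), float(der)

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return lift(other) - self

    def __mul__(self, other):
        other = lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__


def lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(x)


DT, STEPS = 0.002, 300  # hypothetical step size (s) and horizon


def command(t):
    """Hypothetical commanded joint position (rad)."""
    return 0.5 * math.sin(2.0 * math.pi * t)


def rollout(kp, d):
    """Semi-implicit Euler rollout of a unit-inertia joint under PD tracking."""
    q, qd, positions = Dual(0.0), Dual(0.0), []
    for i in range(STEPS):
        qdd = kp * (command(i * DT) - q) - d * qd
        qd = qd + DT * qdd
        q = q + DT * qd
        positions.append(q)
    return positions


def loss_and_grad(theta, measured):
    """Mean squared position error and its gradient w.r.t. (kp, d)."""
    kp, d = theta
    n, grad, loss = len(measured), [], 0.0
    for seed_kp, seed_d in ((1.0, 0.0), (0.0, 1.0)):  # one forward pass per parameter
        positions = rollout(Dual(kp, seed_kp), Dual(d, seed_d))
        residuals = [q.val - m for q, m in zip(positions, measured)]
        loss = sum(r * r for r in residuals) / n
        grad.append(sum(2.0 * r * q.der for r, q in zip(residuals, positions)) / n)
    return loss, grad


def identify(theta, measured, iters=20, lr0=100.0):
    """Gradient descent with backtracking: a step is kept only if the loss drops."""
    loss, grad = loss_and_grad(theta, measured)
    for _ in range(iters):
        lr = lr0
        while lr > 1e-9:
            cand = [p - lr * g for p, g in zip(theta, grad)]
            cand_loss, cand_grad = loss_and_grad(cand, measured)
            if cand_loss < loss:
                theta, loss, grad = cand, cand_loss, cand_grad
                break
            lr *= 0.5
        else:
            break  # no improving step found; stop early
    return theta, loss


measured = [q.val for q in rollout(40.0, 3.0)]  # synthetic stand-in for encoder data
theta0 = [20.0, 6.0]                            # deliberately wrong initial guess
loss0, _ = loss_and_grad(theta0, measured)
theta_hat, loss_hat = identify(theta0, measured)
print(f"loss {loss0:.3e} -> {loss_hat:.3e}, kp={theta_hat[0]:.2f}, d={theta_hat[1]:.2f}")
```

Forward mode costs one rollout per parameter, which is acceptable for two parameters; with many parameters or a neural actuator mapping, reverse-mode differentiation as described in the paper would be the natural choice.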
If this is right
- Reduces mean absolute position error on held-out trajectories from 14.20 mrad to 7.54 mrad.
- Increases travel distance by 46% in real-robot locomotion experiments.
- Reduces rotational deviation by 75% relative to policies trained with baseline actuator models.
- Works with both structured parameterizations and neural actuator mappings in one pipeline.
- Requires no torque sensors, current measurements, or internal controller access.
Where Pith is reading between the lines
- This method might allow continuous online adaptation of actuator models during robot operation using only its normal movements.
- Similar differentiable identification could apply to other components like sensors or linkages from motion data.
- Combining this with reinforcement learning could create end-to-end trainable simulation environments that improve policy transfer.
- The reduction in error suggests that trajectory diversity matters more than steady-state torque data for capturing dynamic actuator behavior.
Load-bearing premise
Errors in simulated joint trajectories are sufficient to uniquely recover the true actuator dynamics without direct torque or internal state measurements.
What would settle it
Finding a set of trajectories where the identified model matches motion well but diverges significantly when tested against independent torque measurements on the same actuator would falsify the uniqueness of recovery from motion alone.
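A small numerical experiment shows what such a counterexample could look like. The sketch assumes a simplified joint model, J·q̈ = kp·(q_cmd − q) − kd·q̇ − b·q̇, which is an illustrative assumption and not the paper's actuator model; the function name and all values are hypothetical. Scaling J, kp, kd and b by the same factor leaves the motion unchanged while doubling the torque delivered by the PD loop.

```python
import math


def simulate(J, kp, kd, b, dt=0.002, steps=500):
    """Roll out a PD-controlled joint; return joint positions and PD torques."""
    q = qd = 0.0
    positions, torques = [], []
    for i in range(steps):
        q_cmd = 0.5 * math.sin(2.0 * math.pi * i * dt)  # hypothetical command (rad)
        tau = kp * (q_cmd - q) - kd * qd                # torque from the embedded PD loop
        qdd = (tau - b * qd) / J                        # viscous joint friction b opposes motion
        qd += dt * qdd                                  # semi-implicit Euler step
        q += dt * qd
        positions.append(q)
        torques.append(tau)
    return positions, torques


# Parameter set B is set A with every parameter doubled.
qs_a, taus_a = simulate(J=1.0, kp=40.0, kd=2.0, b=1.0)
qs_b, taus_b = simulate(J=2.0, kp=80.0, kd=4.0, b=2.0)

motion_gap = max(abs(a - b) for a, b in zip(qs_a, qs_b))
torque_gap = max(abs(a - b) for a, b in zip(taus_a, taus_b))
print(f"max position difference: {motion_gap:.3e} rad")
print(f"max torque difference:   {torque_gap:.3e} N*m")
```

In this assumed model, kd and b also enter the motion only through their sum, so position and velocity data cannot separate controller damping from joint friction either. An independent torque measurement would distinguish the two parameter sets immediately.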
Original abstract
Accurate actuation models are critical for bridging the gap between simulation and real robot behavior, yet obtaining high-fidelity actuator dynamics typically requires dedicated test stands and torque sensing. We present a trajectory-based actuator identification method that uses differentiable simulation to fit system-level actuator models from encoder motion alone. Identification is posed as a trajectory-matching problem: given commanded joint positions and measured joint angles and velocities, we optimize actuator and simulator parameters by backpropagating through the simulator, without torque sensors, current/voltage measurements, or access to embedded motor-control internals. The framework supports multiple model classes, ranging from compact structured parameterizations to neural actuator mappings, within a unified optimization pipeline. On held-out real-robot trajectories for a high-gear-ratio actuator with an embedded PD controller, the proposed torque-sensor-free identification achieves much tighter trajectory alignment than a supervised stand-trained baseline dominated by steady-state data, reducing mean absolute position error from 14.20 mrad to as low as 7.54 mrad (1.88 times). Finally, we demonstrate downstream impact for the same actuator class in a real-robot locomotion study: training policies with the refined actuator model increases travel distance by 46% and reduces rotational deviation by 75% relative to the baseline.
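The headline error figures can be checked with a few lines of arithmetic. The sketch below defines mean absolute position error in the usual way (an assumption, since the abstract does not spell out the formula) and recomputes the quoted 1.88 factor from the two reported values; `mean_abs_error` is a hypothetical helper name.

```python
def mean_abs_error(simulated, measured):
    """Mean absolute position error between two equally sampled trajectories."""
    return sum(abs(s - m) for s, m in zip(simulated, measured)) / len(measured)


baseline_mrad = 14.20  # supervised stand-trained baseline, from the abstract
proposed_mrad = 7.54   # best trajectory-based model, from the abstract

ratio = baseline_mrad / proposed_mrad                   # about 1.88
relative_reduction = 1.0 - proposed_mrad / baseline_mrad  # about 0.469

print(f"error ratio: {ratio:.2f}x, relative reduction: {relative_reduction:.1%}")
```

The 1.88 factor therefore corresponds to roughly 47% lower position error on the held-out trajectories.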
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a trajectory-based actuator identification technique that leverages differentiable simulation to optimize actuator and simulator parameters by minimizing the discrepancy between simulated and observed joint trajectories using only encoder data. It reports improved held-out trajectory matching for a high-gear-ratio actuator with embedded PD control, reducing mean absolute position error from 14.20 mrad to 7.54 mrad, and shows that policies trained with the identified model achieve 46% greater travel distance and 75% less rotational deviation in real-robot locomotion experiments.
Significance. Should the approach reliably recover actuator dynamics rather than merely overfitting to specific closed-loop trajectories, it would offer a practical alternative to sensor-intensive identification methods, facilitating more accurate simulation-based policy training without specialized hardware. The inclusion of downstream real-world validation strengthens the potential impact for robotics applications involving sim-to-real transfer.
major comments (2)
- [Abstract] The central claim of torque-sensor-free 'actuator identification' rests on trajectory matching via backpropagation through the simulator. However, for high-gear-ratio actuators with embedded PD controllers, the optimization of parameters (friction, stiffness, damping, effective gains) to match position/velocity trajectories alone does not establish uniqueness; multiple parameter sets can reproduce the same closed-loop behavior, so the held-out error reduction (14.20 mrad to 7.54 mrad) demonstrates improved fitting but not recovery of true open-loop dynamics.
- [§4 (Experiments)] The comparison to the supervised stand-trained baseline is undermined by the baseline being 'dominated by steady-state data'; without explicit details on data splitting, trajectory distribution matching, or whether the baseline had access to equivalent dynamic trajectories, it is unclear whether the reported gains are attributable to the differentiable method or to differences in training data coverage.
minor comments (2)
- [Abstract, §3 (Method)] The manuscript does not report statistical significance (e.g., standard errors or p-values) for the error reductions or the 46%/75% policy improvements, nor does it specify convergence criteria or initialization strategy for the parameter optimization.
- [§3 (Method)] While the framework is said to support multiple model classes, the exact parameterization of the structured models (e.g., how friction and damping terms are defined) and the neural mapping architecture are not detailed enough for full reproducibility.
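The request for standard errors in the first minor comment could be met with a summary like the one sketched here. It assumes that a position error is available per held-out trajectory, which the review does not confirm; the helper name and the numbers passed to it are placeholders for illustration, not results from the paper.

```python
import math
import statistics


def mean_and_stderr(values):
    """Sample mean and standard error of the mean for per-trajectory errors."""
    mean = statistics.fmean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr


# Placeholder per-trajectory errors in mrad (illustrative values only).
example_errors = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, stderr = mean_and_stderr(example_errors)
print(f"{mean:.2f} +/- {stderr:.2f} mrad")
```

Reporting the two methods in this form would let a reader judge whether the gap between them exceeds the trajectory-to-trajectory spread.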
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments. We address each major comment point-by-point below, clarifying the scope of our claims and committing to revisions that strengthen the manuscript's presentation of data details and limitations.
- Referee: [Abstract] The central claim of torque-sensor-free 'actuator identification' rests on trajectory matching via backpropagation through the simulator. However, for high-gear-ratio actuators with embedded PD controllers, the optimization of parameters (friction, stiffness, damping, effective gains) to match position/velocity trajectories alone does not establish uniqueness; multiple parameter sets can reproduce the same closed-loop behavior, so the held-out error reduction (14.20 mrad to 7.54 mrad) demonstrates improved fitting but not recovery of true open-loop dynamics.
  Authors: We agree that the optimization yields parameters that reproduce observed closed-loop trajectories and does not guarantee uniqueness or recovery of the underlying open-loop dynamics; this is an inherent limitation of torque-sensor-free identification from position/velocity data alone. Our contribution focuses on practical sim-to-real utility: the resulting model improves held-out trajectory prediction and, more importantly, yields policies with 46% greater real-world travel distance. We will revise the abstract and §1 to explicitly state that the method targets closed-loop trajectory fidelity rather than claiming recovery of true open-loop parameters.
  Revision: yes
- Referee: [§4 (Experiments)] The comparison to the supervised stand-trained baseline is undermined by the baseline being 'dominated by steady-state data'; without explicit details on data splitting, trajectory distribution matching, or whether the baseline had access to equivalent dynamic trajectories, it is unclear whether the reported gains are attributable to the differentiable method or to differences in training data coverage.
  Authors: We acknowledge that additional details are required to substantiate the baseline comparison. The stand-trained baseline used data from a dedicated test stand that included dynamic trajectories, but the collection protocol resulted in a distribution skewed toward steady-state conditions. Our method leverages a wider range of operational dynamic trajectories. In the revision we will expand §4 with explicit descriptions of data collection protocols, train/validation splits, trajectory statistics (e.g., velocity histograms), and coverage metrics for both datasets to enable direct assessment of whether performance differences arise from the identification method or data distribution.
  Revision: yes
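The trajectory statistics promised in the second response could take a form like the following. This is a sketch of one possible coverage summary, not the authors' protocol: the 0.05 rad/s threshold for "near steady state", the bin edges, and both function names are assumptions chosen for illustration.

```python
def steady_state_fraction(velocities, threshold=0.05):
    """Fraction of samples whose joint speed is below `threshold` (rad/s)."""
    return sum(1 for v in velocities if abs(v) < threshold) / len(velocities)


def speed_histogram(velocities, edges):
    """Count samples per speed bin; bin i covers edges[i] <= |v| < edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for v in velocities:
        speed = abs(v)
        for i in range(len(counts)):
            if edges[i] <= speed < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Tiny illustrative sample of joint velocities in rad/s (not real data).
sample = [0.0, 0.01, -0.02, 1.0, -2.5]
print(steady_state_fraction(sample))               # share of near-static samples
print(speed_histogram(sample, [0.0, 0.05, 1.0, 5.0]))
```

Computing both quantities for the stand dataset and for the operational trajectories would show directly how far the two distributions differ, which is the information the referee asks for.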
Circularity Check
No circularity: empirical trajectory fitting with held-out validation
The paper frames actuator identification as an optimization problem that minimizes position/velocity trajectory mismatch by back-propagating through a differentiable simulator. Parameters are fitted on training trajectories and evaluated on held-out real-robot data, with further downstream policy training results reported on physical hardware. This is a standard data-driven fitting procedure whose reported error reductions (e.g., 14.20 mrad to 7.54 mrad) and locomotion improvements are measured against an external baseline on independent test trajectories. No step reduces by definition to its own inputs, no fitted quantity is relabeled as a prediction, and no load-bearing self-citation chain is invoked. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- actuator model parameters
- simulator parameters
axioms (2)
- domain assumption: The simulation dynamics are differentiable with respect to actuator and simulator parameters
- domain assumption: Observed joint trajectories contain sufficient excitation to identify the actuator dynamics