
arXiv: 2510.07625 · v2 · submitted 2025-10-08 · 💻 cs.RO · cs.SY · eess.SY

GATO: GPU-Accelerated and Batched Trajectory Optimization for Scalable Edge Model Predictive Control

Pith reviewed 2026-05-18 08:29 UTC · model grok-4.3

classification 💻 cs.RO · cs.SY · eess.SY
keywords GPU acceleration · trajectory optimization · model predictive control · batched optimization · robotics · real-time control · edge computing · nonlinear optimization

The pith

GATO delivers real-time batched nonlinear trajectory optimization on the GPU for the moderate batch sizes (tens to low hundreds of solves) that arise in model predictive control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GATO as a GPU-accelerated solver for batches of nonlinear trajectory optimization problems that arise in model predictive control. It targets the regime of tens to low hundreds of simultaneous solves, where existing CPU methods are too slow and prior GPU methods either sacrifice speed or model generality. The core approach co-designs the algorithm, software stack, and hardware mapping to combine block-, warp-, and thread-level parallelism both within and across solves. This matters for robotics because many state-of-the-art MPC applications need multiple real-time solves for disturbance rejection and replanning on edge hardware. The authors demonstrate the result through scaling benchmarks, improved control behavior in case studies, and direct hardware validation on an industrial manipulator.

Core claim

GATO is an open-source GPU-accelerated batched trajectory optimization solver that combines block-, warp-, and thread-level parallelism to achieve real-time throughput for moderate batch sizes of nonlinear solves. The paper reports speedups of 18-21x over CPU baselines and 1.4-16x over GPU baselines as batch size increases, together with improved disturbance rejection and convergence behavior in case studies, and validates the solver on physical hardware.

What carries the argument

Multi-level (block, warp, thread) parallelism applied within and across solves in a batched nonlinear trajectory optimization framework.
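
One way to picture the three levels is as an index mapping. The sketch below is plain Python, not GPU code, and the assignment it uses (one thread block per solve, one warp per timestep, one lane per vector element) is an illustrative assumption rather than the mapping the paper documents. The names `WorkItem`, `map_thread`, and `launch_shape` are hypothetical.

```python
from typing import NamedTuple

WARP_SIZE = 32                 # threads per warp on current NVIDIA GPUs
MAX_THREADS_PER_BLOCK = 1024   # CUDA per-block thread limit


class WorkItem(NamedTuple):
    solve: int      # problem index in the batch (assumed: one block per solve)
    timestep: int   # knot point index (assumed: one warp per timestep)
    lane: int       # thread lane inside the warp


def map_thread(block_idx: int, thread_idx: int) -> WorkItem:
    """Map CUDA-style (blockIdx.x, threadIdx.x) to a unit of work."""
    return WorkItem(solve=block_idx,
                    timestep=thread_idx // WARP_SIZE,
                    lane=thread_idx % WARP_SIZE)


def launch_shape(batch_size: int, horizon: int) -> tuple:
    """Return (blocks, threads_per_block) for the assumed mapping."""
    threads = horizon * WARP_SIZE
    if threads > MAX_THREADS_PER_BLOCK:
        # A real kernel would loop warps over timesteps instead.
        raise ValueError("horizon too long for one warp per timestep")
    return batch_size, threads
```

Under this assumed mapping, parallelism "across solves" is the block index and parallelism "within a solve" is the warp and lane index. The 1024-thread limit shows why a naive one-warp-per-timestep layout only covers horizons of up to 32 timesteps without looping.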

If this is right

  • Real-time model predictive control becomes practical on edge hardware for tasks that require simultaneous optimization of tens to low hundreds of trajectories.
  • GATO's reported speedup over CPU and GPU baselines grows with batch size, which favors applications that benefit from multiple parallel plans.
  • The case studies show improved disturbance rejection and convergence behavior, and the solver is validated on an industrial manipulator.
  • The open-source release allows direct reproduction and integration into existing robotics control stacks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multi-level parallelism strategy could be adapted to other batch optimization problems in robotics such as motion planning or parameter estimation.
  • Faster per-batch solve times may allow MPC to operate at higher replanning frequencies or with more complex dynamics models on the same hardware.
  • Energy use on embedded platforms could decrease because shorter computation windows leave more time in low-power states.

Load-bearing premise

That combining block-, warp-, and thread-level parallelism on the GPU produces no prohibitive synchronization costs and preserves generality for nonlinear problems at moderate batch sizes.

What would settle it

A benchmark run at batch sizes of 50-200 where GATO either falls below real-time rates, shows no speedup over a tuned CPU solver, or loses solution accuracy for the same nonlinear models.
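
The three failure conditions above can be written as a small check. This is a minimal sketch, assuming a benchmark harness that records one row per batch size; the `BenchmarkRow` fields, the `claim_fails` helper, and the default tolerance are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRow:
    batch_size: int
    gpu_solve_ms: float   # hypothetical: measured batched solve time of the GPU solver
    cpu_solve_ms: float   # hypothetical: tuned CPU baseline on the same batch
    max_cost_gap: float   # hypothetical: worst relative cost gap vs. a reference solution


def claim_fails(row: BenchmarkRow, control_period_ms: float,
                cost_tol: float = 1e-3) -> list:
    """Return the reasons a benchmark row would count against the claim."""
    reasons = []
    if row.gpu_solve_ms > control_period_ms:
        reasons.append("below real-time rate")
    if row.cpu_solve_ms / row.gpu_solve_ms <= 1.0:
        reasons.append("no speedup over CPU")
    if row.max_cost_gap > cost_tol:
        reasons.append("accuracy loss")
    return reasons
```

An empty list for every row with batch size between 50 and 200 would be consistent with the claim; any non-empty list names the condition that was violated.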

Figures

Figures reproduced from arXiv: 2510.07625 by Alexander Du, Brian Plancher, Emre Adabag, Gabriel Bravo-Palacios.

Figure 1: The GATO solver parallelizes across batches of …
Figure 2: Overall design of our batched solver which a) forms problems in parallel across solves and timesteps, b) leverages …
Figure 3: (Left) Solve times for 6-DoF manipulator motions while varying the batch size ( …
Figure 4: Average (normalized) merit function value across …
Figure 5: Figure-8 tracking task, with an external disturbance applied at the end effector. (Left) Bar chart shows tracking error, …
Figure 6: Simulation visualization at the last timestep of the …
Figure 7: Cumulative density function of the solve times for …
Original abstract

While Model Predictive Control (MPC) delivers strong performance across robotics applications, solving the underlying (batches of) nonlinear trajectory optimization (TO) problems online remains computationally demanding. Existing GPU-accelerated approaches either parallelize single solves, handle large batches at sub-real-time rates, or sacrifice model generality for speed. This leaves a large gap in solver performance for many state-of-the-art MPC applications that require real-time batches of tens to low-hundreds of solves. As such, we present GATO, an open source, GPU-accelerated, batched TO solver co-designed across algorithm, software, and computational hardware to deliver real-time throughput for these moderate batch size regimes. Our approach leverages a combination of block-, warp-, and thread-level parallelism within and across solves for ultra-high performance. We demonstrate the effectiveness of our approach through a combination of: simulated benchmarks showing speedups of 18-21x over CPU baselines and 1.4-16x over GPU baselines as batch size increases; case studies highlighting improved disturbance rejection and convergence behavior; and finally a validation on hardware using an industrial manipulator. We open source GATO to support reproducibility and adoption.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript presents GATO, an open-source GPU-accelerated batched trajectory optimization solver for model predictive control targeting moderate batch sizes of tens to low hundreds of solves. It claims to fill a performance gap by co-designing algorithm, software, and hardware with combined block-, warp-, and thread-level parallelism, reporting empirical speedups of 18-21x over CPU baselines and 1.4-16x over other GPU baselines, along with case studies on disturbance rejection and hardware validation on an industrial manipulator.

Significance. If the throughput claims hold with adequate analysis of overheads, the work would meaningfully advance real-time nonlinear MPC on edge hardware by supporting moderate batch regimes without sacrificing model generality. The open-sourcing for reproducibility and the inclusion of hardware experiments are clear strengths that enhance the practical value of the contribution.

major comments (1)
  1. Abstract: the central performance claims (18-21x CPU and 1.4-16x GPU speedups for moderate batch sizes) rest on the assumption that block-, warp-, and thread-level parallelism can be combined without prohibitive synchronization overhead or loss of generality for nonlinear TO. The abstract does not detail how data-dependent operations such as dynamics linearization or line search are scheduled to avoid warp divergence and cross-warp barriers in this batch-size regime, which is load-bearing for the real-time throughput assertion.
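
To make the divergence concern concrete, the sketch below shows one common way to keep a line search uniform: evaluate a fixed set of candidate step sizes for every solve, then select afterwards. It is plain Python standing in for GPU work, and it is an assumption about how such scheduling could look, not the scheme GATO uses. `batched_line_search`, `ALPHAS`, and the `merit` callback are hypothetical names.

```python
ALPHAS = tuple(0.5 ** k for k in range(8))  # 1, 1/2, ..., 1/128 (illustrative choice)


def batched_line_search(merit, xs, dxs, alphas=ALPHAS):
    """For each solve, pick the candidate step size with the lowest merit value.

    Every (solve, alpha) pair performs the same work, which is the property
    that would let GPU threads run in lockstep; the data-dependent choice is
    deferred to a final selection. Returns 0.0 for a solve when no candidate
    improves on its current merit value.
    """
    chosen = []
    for x, dx in zip(xs, dxs):                      # across solves
        base = merit(x)
        values = [merit([xi + a * di for xi, di in zip(x, dx)])
                  for a in alphas]                  # across candidates
        best = min(range(len(alphas)), key=values.__getitem__)
        chosen.append(alphas[best] if values[best] < base else 0.0)
    return chosen
```

A backtracking line search, by contrast, exits after a different number of iterations for each solve, which is the kind of data-dependent control flow the comment above asks the authors to account for.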
minor comments (2)
  1. The description of the GPU baselines would benefit from an explicit statement of their batch-size scaling behavior and of whether they also target moderate regimes, to strengthen the comparative claims.
  2. Consider adding a short table or paragraph summarizing the specific robot models, horizon lengths, and constraint types used in the simulated benchmarks to improve reproducibility.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review and for recognizing the practical value of GATO, including the open-source release and hardware validation. We address the single major comment below and outline a targeted revision to the manuscript.

Point-by-point responses
  1. Referee: Abstract: the central performance claims (18-21x CPU and 1.4-16x GPU speedups for moderate batch sizes) rest on the assumption that block-, warp-, and thread-level parallelism can be combined without prohibitive synchronization overhead or loss of generality for nonlinear TO. The abstract does not detail how data-dependent operations such as dynamics linearization or line search are scheduled to avoid warp divergence and cross-warp barriers in this batch-size regime, which is load-bearing for the real-time throughput assertion.

    Authors: We appreciate the referee highlighting the need for greater clarity on this point. The abstract is intentionally high-level to summarize the contribution and results. The detailed co-design of block-, warp-, and thread-level parallelism, along with the scheduling of data-dependent operations (dynamics linearization, line search, etc.) to control warp divergence and synchronization costs, is described in Sections 3 and 4 of the manuscript. Our implementation uses uniform batch processing, warp-level primitives for reductions, and kernel structures that minimize cross-warp barriers while preserving full nonlinear model generality. The reported speedups are measured end-to-end and already incorporate all overheads, as shown in the scaling benchmarks of Section 5. To make the abstract more self-contained and directly address the referee's concern, we will revise it to include a concise statement on the scheduling approach for data-dependent operations.

    Revision: yes
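
The "warp-level primitives for reductions" mentioned in the response usually refer to the shuffle-down tree pattern, commonly written in CUDA with the `__shfl_down_sync` intrinsic. The following is a CPU-side emulation in Python to show the data flow only; it is not taken from the GATO source, and `warp_reduce_sum` is a hypothetical name.

```python
WARP_SIZE = 32  # lanes per warp on current NVIDIA GPUs


def warp_reduce_sum(lane_values):
    """Emulate a shuffle-down tree reduction over one 32-lane warp.

    At each step, lane i adds the value held by lane i + offset, and the
    offset halves until lane 0 holds the total. On a GPU each step is a
    register exchange between lanes, with no shared-memory barrier.
    """
    vals = list(lane_values)
    if len(vals) != WARP_SIZE:
        raise ValueError("expected one value per lane")
    offset = WARP_SIZE // 2
    while offset > 0:
        prev = vals[:]  # snapshot so all lanes update in lockstep
        for lane in range(WARP_SIZE):
            src = lane + offset
            if src < WARP_SIZE:
                vals[lane] = prev[lane] + prev[src]
        offset //= 2
    return vals[0]  # upper lanes hold partial sums that are ignored
```

The reduction finishes in five steps (offsets 16, 8, 4, 2, 1) for 32 lanes, which is why sums such as merit function terms or dot products are cheap inside a warp and only become costly when they have to cross warp boundaries.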

Circularity Check

0 steps flagged

No significant circularity in empirical performance claims

full rationale

The paper presents GATO as an implemented co-designed GPU solver for batched nonlinear trajectory optimization and supports its real-time throughput claims exclusively through direct empirical benchmarks (speedups of 18-21x over CPU and 1.4-16x over other GPU baselines) plus hardware validation on an industrial manipulator. These are measured outcomes from simulated and physical tests rather than any derived predictions, first-principles results, or fitted parameters that reduce to the inputs by construction. No equations, uniqueness theorems, ansatzes, or self-citations are invoked as load-bearing steps in the provided claims; the central argument rests on the observed behavior of the block/warp/thread parallelism implementation itself, which is externally falsifiable via the open-sourced code and independent re-runs. The derivation chain is therefore self-contained in the engineering and benchmarking methodology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard assumptions from nonlinear optimization and parallel computing literature; no new free parameters or invented entities are introduced beyond the software artifact itself.

axioms (1)
  • Domain assumption: Nonlinear trajectory optimization problems in MPC can be solved reliably with standard numerical methods when sufficient compute is available.
    Invoked implicitly when claiming real-time performance for general models.

pith-pipeline@v0.9.0 · 5752 in / 1112 out tokens · 40102 ms · 2026-05-18T08:29:39.446115+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Vectorizing Projection in Manifold-Constrained Motion Planning for Real-Time Whole-Body Control

    cs.RO · 2026-04 · conditional · novelty 6.0

    Vectorizing projection operations enables real-time manifold-constrained motion planning for humanoid robots with 100-1000x speedups over prior methods.
