arxiv: 2605.12628 · v1 · pith:RG5UNAH6new · submitted 2026-05-12 · 💻 cs.RO

Multistep Belief Space Dynamics Learning For Risk-Aware Control

Pith reviewed 2026-05-14 20:41 UTC · model grok-4.3

classification 💻 cs.RO
keywords risk-aware control · model predictive control · distributional dynamics · autonomous vehicles · off-road driving · belief space · uncertainty prediction

The pith

A structured multistep approach to learning distributional dynamics enables risk-aware MPC that naturally regulates speed in off-road driving.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a learning framework that predicts how dynamical uncertainty evolves over multiple future steps, so that model predictive control (MPC) can optimize risk-aware plans in real time. An ablation study on a large real-world off-road dataset shows that the structure of the learned dynamics matters: deviations from the proposed structure degrade the results, which the ablation figures report as summed NLL loss and end-of-trajectory state errors. Deployment on a full-sized vehicle shows the resulting planner regulating speed according to the environment across miles of varied terrain instead of defaulting to excessive caution.

Core claim

The central claim is that multistep belief-space dynamics learned with specific structural constraints can be optimized directly inside MPC, producing risk-aware trajectories that evolve uncertainty predictions forward in time without introducing hidden conservatism, as validated by ablation results on off-road data and by closed-loop behavior on a physical vehicle.

What carries the argument

Structured multistep predictor of distributional dynamics inside MPC that models uncertainty propagation for real-time optimization.
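
To make the idea of multistep uncertainty propagation concrete, the following is a minimal sketch, not the paper's model. The figure captions mention sigma points, so the sketch uses a scalar unscented transform; the toy speed dynamics, the `kappa` value, and the additive `process_var` are all assumptions introduced here for illustration.

```python
import math


def sigma_points(mean, var, kappa=2.0):
    """Sigma points and weights for a scalar Gaussian belief (n = 1)."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    w_center = kappa / (n + kappa)
    w_side = 1.0 / (2.0 * (n + kappa))
    return points, [w_center, w_side, w_side]


def toy_dynamics(v, throttle, dt=0.1):
    """Hypothetical speed model: throttle accelerates, quadratic drag slows."""
    return v + dt * (throttle - 0.05 * v * abs(v))


def propagate_belief(mean, var, controls, process_var=0.01):
    """Roll a (mean, variance) belief forward over a control sequence."""
    beliefs = [(mean, var)]
    for u in controls:
        points, weights = sigma_points(mean, var)
        # Push every sigma point through the (nonlinear) dynamics.
        nxt = [toy_dynamics(p, u) for p in points]
        # Re-fit a Gaussian to the transformed points.
        mean = sum(w * x for w, x in zip(weights, nxt))
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, nxt))
        # Assumed additive process noise (a learned model would predict this).
        var += process_var
        beliefs.append((mean, var))
    return beliefs
```

A planner could call `propagate_belief` once per candidate control sequence and score the resulting sequence of means and variances, which is the role a learned belief-space model would play inside MPC.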

If this is right

  • MPC can optimize trajectories that incorporate multi-step uncertainty evolution without defaulting to overly cautious plans.
  • Vehicle speed regulation emerges naturally from the predicted distributional dynamics rather than from hand-tuned rules.
  • Ablation confirms that removing the structural elements in the learner measurably harms closed-loop performance.
  • The same learned model supports consistent intelligent behavior across miles of diverse off-road terrain.
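
The claim that speed regulation can emerge from predicted uncertainty, not from hand-tuned rules, can be illustrated with a small sketch. The Figure 9 caption mentions a parameter cσ set to 2.0 and 1.0 in two runs, but this page does not define it; the sketch assumes it scales the predicted standard deviation in a mean-plus-deviation risk measure. The uncertainty model `predicted_lateral_std`, the corridor width, and all numbers are hypothetical.

```python
def risk_adjusted_deviation(mean_dev, std_dev, c_sigma):
    """Assumed risk measure: mean deviation plus c_sigma standard deviations."""
    return abs(mean_dev) + c_sigma * std_dev


def predicted_lateral_std(speed, roughness):
    """Hypothetical stand-in for a learned model: lateral uncertainty
    grows with speed and with terrain roughness."""
    return roughness * speed


def fastest_safe_speed(candidates, roughness, c_sigma, corridor_half_width):
    """Pick the highest candidate speed whose risk-adjusted lateral
    deviation stays inside the corridor; fall back to the slowest."""
    feasible = [
        v for v in candidates
        if risk_adjusted_deviation(
            0.0, predicted_lateral_std(v, roughness), c_sigma
        ) <= corridor_half_width
    ]
    return max(feasible) if feasible else min(candidates)


speeds = [2.0, 4.0, 6.0, 8.0, 10.0]
cautious = fastest_safe_speed(speeds, roughness=0.1, c_sigma=2.0,
                              corridor_half_width=1.5)
bolder = fastest_safe_speed(speeds, roughness=0.1, c_sigma=1.0,
                            corridor_half_width=1.5)
```

Under these assumptions no speed limit is coded anywhere: the chosen speed falls when the predicted uncertainty or the risk weight rises, and increases again on smoother terrain.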

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multistep distributional predictor could be transferred to other platforms where uncertainty grows rapidly, such as aerial or marine robots.
  • Longer planning horizons become feasible once the dynamics model already carries forward distributional information.
  • Integration with additional sensing modalities would likely tighten the uncertainty predictions and further reduce conservatism.

Load-bearing premise

That preserving the proposed structure during distributional dynamics learning is required to prevent material degradation in MPC performance and that the learned model generalizes reliably outside the collected off-road dataset.

What would settle it

Running the identical MPC planner on the same off-road dataset once with the structural constraints removed and once with them intact, then measuring whether speed regulation becomes either excessively conservative or unstable.

Figures

Figures reproduced from arXiv: 2605.12628 by Bogdan Vlahov, Evangelos A. Theodorou, Jason Gibson, Patrick Spieler.

Figure 1: Architecture of uncertainty propagation and risk measure computation.
Figure 2: State trajectory and sigma point plots of the HW model with and without the closed-loop gains. In Fig. …
Figure 3: A comparison of open-loop trajectories and sigma points from the three different models used in Fig. …
Figure 4: Box plots showing total summed NLL loss for various structural changes to the networks. Whiskers are defined as …
Figure 5: Box plots showing total summed NLL loss for various changes to the size or inputs of the network. Whiskers are defined as …
Figure 6: A view into the long traverse discussed in Section …
Figure 7: Speed maps for the first case study discussed in Section …
Figure 8: Top-down view of v_x and images for the second case study discussed in Section V-B2. The top-down map and images are marked by numbers in purple that correlate across the entire figure.
… control and speeds toward the waypoint. At the peak of the second slide the vehicle is traveling at ≈ 5 m/s in the body y direction (Fig. 7b). The vehicle does not hit any obstacle during the slide and continues to apply…
Figure 9: Top-down view of v_x and images for the different runs of the trail course outlined in Section V-B3. The top-down map and images are marked by numbers in purple that correlate with the map shown to their left. Fig. 9a and Fig. 9b are the first pair, and Fig. 9d and Fig. 9e the second pair, taken from the same location; each pair can be compared between T1 and T2. T1 uses cσ = 2.0 and T2 uses cσ = 1.0 but…
Figure 10: Box plots showing distance error at the end of the …
Figure 11: Box plots showing ψ error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 12: Box plots showing v_x error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 13: Box plots showing v_y error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 14: Box plots showing ψ̇ error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 15: Box plots showing δ (steering angle, in range [−5, 5]) error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 16: Box plots showing e (engine RPM) error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are shown with the arrows; the green line marks the median and the orange line the mean. The baseline model (bline) is kept consistent. The keys used are explained in the text of Section V-A1.
Figure 17: Full set of images from marker 1 in Fig. …
Figure 18: Full set of images from marker 2 in Fig. …
Figure 19: Full set of images from marker 3 in Fig. …
Figure 20: Full set of images from marker 4 in Fig. …
Figure 21: Full set of images from marker 5 in Fig. …
Figure 22: Full set of images from marker 6 in Fig. …
Figure 23: Full set of images from marker 1 in Fig. …
Figure 24: Full set of images from marker 2 in Fig. …
Figure 25: Full set of images from marker 3 in Fig. …
Figure 26: Full set of images from marker 4 in Fig. …
Figure 27: Full set of images from marker 1 in Fig. …
Figure 28: Full set of images from marker 2 in Fig. …
Figure 29: Full set of images from marker 3 in Fig. …
Figure 30: Full set of images from marker 4 in Fig. …
Figure 31: Full set of images from marker 5 in Fig. …
Figure 32: Full set of images from marker 1 in Fig. …
Figure 33: Full set of images from marker 2 in Fig. …
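
The captions for Figures 4 and 5 refer to a total summed NLL loss. As a rough guide to what such a metric measures, the sketch below sums a Gaussian negative log-likelihood over every step and state dimension of a predicted rollout. It assumes independent (diagonal) Gaussian predictions per state dimension, which may differ from the paper's formulation; the function names are hypothetical.

```python
import math


def gaussian_nll(error, var):
    """Negative log-likelihood of a scalar error under N(0, var)."""
    return 0.5 * (math.log(2.0 * math.pi * var) + error * error / var)


def summed_trajectory_nll(targets, means, variances):
    """Sum per-dimension Gaussian NLL over every step of a rollout.

    Each argument is a list of steps; each step is a list with one
    value per state dimension.
    """
    total = 0.0
    for y_t, mu_t, var_t in zip(targets, means, variances):
        for y, mu, var in zip(y_t, mu_t, var_t):
            total += gaussian_nll(y - mu, var)
    return total


# Two-step, two-dimensional rollout with an error of 1.0 in the first dimension.
targets = [[1.0, 0.0], [2.0, 0.5]]
means = [[0.0, 0.0], [1.0, 0.5]]
calibrated = summed_trajectory_nll(targets, means, [[1.0, 1.0], [1.0, 1.0]])
overconfident = summed_trajectory_nll(targets, means, [[0.01, 0.01], [0.01, 0.01]])
```

The metric penalizes both inaccurate means and miscalibrated variances: here the overconfident prediction scores worse than the calibrated one even though both share the same mean error.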
Original abstract

As autonomous vehicles move from a simplified research setting to practical use, there exists a large gap between the dynamic behavior of a human driving and an autonomous system. Risk-aware behavior needs to naturally develop in order to scale to the demands of the real world. A major issue for risk-aware planning and control has been predicting how dynamical uncertainty evolves through time and optimizing plans that account for this without being overly conservative. Here, we present a learning framework to predict distributional dynamics that can be optimized in real time for Model Predictive Control (MPC). We explore the importance of structure when learning distributional dynamics for use in MPC. A rigorous ablation study is conducted on a large dataset of real world off-road driving that shows the impact of deviations from our proposed structure. Furthermore, we deploy our learned model and planning stack on a full sized vehicle in challenging off-road conditions. Our planning architecture is able to naturally regulate the speed of the vehicle based on the environment and consistently demonstrates intelligent behavior over miles of diverse terrain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a learning framework for multistep belief-space distributional dynamics that can be optimized in real time within a model-predictive control (MPC) loop for risk-aware autonomous driving. It stresses the importance of specific structural choices in the learned model, supports this via a quantitative ablation study on a large real-world off-road driving dataset, and reports closed-loop deployment on a full-sized vehicle that exhibits natural speed regulation and consistent intelligent behavior over miles of diverse terrain.

Significance. If the empirical claims hold, the work offers a practical route to non-conservative risk-aware planning by learning distributional dynamics directly from real data rather than hand-crafted uncertainty models. The combination of a controlled ablation on a substantial off-road corpus and successful full-vehicle deployment supplies concrete evidence that structured distributional learning can be integrated into real-time MPC without introducing hidden instability or excessive conservatism.

major comments (2)
  1. [§4] Ablation Study: the quantitative results demonstrate performance differences when structural components are removed, yet the paper does not report the precise MPC cost terms or risk-measure definitions used to generate the tabulated metrics; without these, it is difficult to judge whether the observed gains are attributable to the distributional structure or to incidental changes in the optimization objective.
  2. [§5] Vehicle Deployment: the claim that the architecture 'naturally regulates the speed' and 'demonstrates intelligent behavior' is supported by qualitative description and mileage figures, but no quantitative comparison (e.g., mean speed, time-to-collision statistics, or risk-cost histograms) against a non-learned baseline MPC is provided; this weakens the assertion that the learned model generalizes beyond the training distribution without introducing hidden conservatism.
minor comments (2)
  1. [Abstract] The abstract omits any mention of model architecture, training procedure, or numerical performance metrics; adding one sentence summarizing these elements would improve readability and allow readers to assess the central claims immediately.
  2. [§3] Notation for the belief-space representation and the multistep rollout operator is introduced without an explicit equation reference in the main text; a single displayed equation defining the distributional transition would clarify the subsequent ablation discussion.
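
For orientation, a displayed equation of the kind the second minor comment asks for could take the following generic form. This is an illustrative sketch under a Gaussian-belief assumption, not the paper's definition: the symbols $b_t$, $\mu_t$, $\Sigma_t$, $u_t$, $f_\theta$, and the horizon $H$ are introduced here and do not come from the paper.

```latex
% Belief at time t: mean and covariance of the state (assumed Gaussian).
b_t = (\mu_t, \Sigma_t)

% One-step distributional transition under control u_t, with learned parameters \theta.
b_{t+1} = f_\theta(b_t, u_t)

% Multistep rollout: the model is applied to its own predicted belief H times.
b_{t+H} = f_\theta\bigl(\cdots f_\theta\bigl(f_\theta(b_t, u_t),\, u_{t+1}\bigr) \cdots,\, u_{t+H-1}\bigr)
```

Writing the rollout this way makes explicit that each step consumes the previous predicted belief, so the ablated structural choices act on every step of the horizon.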

Simulated Author's Rebuttal

2 responses · 1 unresolved

We are grateful to the referee for the detailed and constructive feedback, as well as the recommendation for minor revision. We address each major comment below.

Point-by-point responses
  1. Referee: [§4] Ablation Study: the quantitative results demonstrate performance differences when structural components are removed, yet the paper does not report the precise MPC cost terms or risk-measure definitions used to generate the tabulated metrics; without these, it is difficult to judge whether the observed gains are attributable to the distributional structure or to incidental changes in the optimization objective.

    Authors: We concur that the lack of explicit MPC cost terms and risk-measure definitions in the ablation study section makes it challenging to fully attribute the performance gains. In the revised manuscript, we will include the precise formulation of the MPC objective, specifying the risk measures employed (e.g., the exact definition and parameters of the distributional risk metric) and all associated cost weights. This addition will allow readers to verify that the improvements stem from the proposed structural choices in the belief-space dynamics model rather than variations in the optimization setup. revision: yes

  2. Referee: [§5] Vehicle Deployment: the claim that the architecture 'naturally regulates the speed' and 'demonstrates intelligent behavior' is supported by qualitative description and mileage figures, but no quantitative comparison (e.g., mean speed, time-to-collision statistics, or risk-cost histograms) against a non-learned baseline MPC is provided; this weakens the assertion that the learned model generalizes beyond the training distribution without introducing hidden conservatism.

    Authors: We thank the referee for highlighting this point. Although a direct quantitative comparison with a non-learned baseline MPC was not conducted during the real-world deployment for reasons of experimental safety and setup complexity, we will enhance the deployment section with additional quantitative analyses from the collected vehicle data. This will include statistics such as mean and variance of vehicle speeds over different terrain segments, distributions of computed risk costs, and any available time-to-collision metrics. These additions should provide a more robust quantitative backing for the observed intelligent behavior and generalization capabilities. revision: partial

standing simulated objections not resolved
  • Quantitative comparison against a non-learned baseline MPC in the closed-loop vehicle deployment

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents a learning framework that trains distributional dynamics models on external real-world off-road driving data, then deploys the resulting model inside an MPC planner. All load-bearing claims (speed regulation, intelligent behavior over miles of terrain, ablation impact of structure) are supported by empirical evaluation on held-out data and closed-loop vehicle tests rather than by re-deriving outputs from fitted parameters or self-citations. No equation reduces a prediction to its own input by construction, no uniqueness theorem is imported from prior author work, and no ansatz is smuggled via citation. The architecture therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that distributional dynamics learned from data can be optimized in MPC without post-hoc conservatism, plus standard supervised learning assumptions on data quality and model capacity.

free parameters (1)
  • distributional model hyperparameters
    Parameters of the learned dynamics model are fitted to the off-road dataset; exact values and selection process not specified in abstract.
axioms (1)
  • domain assumption: Uncertainty in vehicle dynamics evolves in a learnable distributional manner over multiple steps
    Invoked as the basis for the learning framework and its use in MPC.

pith-pipeline@v0.9.0 · 5475 in / 1188 out tokens · 49131 ms · 2026-05-14T20:41:08.745430+00:00 · methodology

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 1 internal anchor

  1. [1]

    A comprehensive review on autonomous navigation,

    S. Nahavandi, R. Alizadehsani, D. Nahavandi, S. Mohamed, N. Mohajer, M. Rokonuzzaman, and I. Hossain, “A comprehensive review on autonomous navigation,”ACM Comput. Surv., vol. 57, no. 9, May

  2. [2]

    Available: https://doi.org/10.1145/3727642 1

    [Online]. Available: https://doi.org/10.1145/3727642 1

  3. [3]

    Parting with misconcep- tions about learning-based vehicle motion planning

    D. Dauner, M. Hallgarten, A. Geiger, and K. Chitta, “Parting with misconceptions about learning-based vehicle motion planning,” no. arXiv:2306.07962, Nov. 2023, arXiv:2306.07962 [cs]. [Online]. Available: http://arxiv.org/abs/2306.07962 1, 2

  4. [4]

    Quantifying generalization in reinforcement learning,

    K. Cobbe, O. Klimov, C. Hesse, T. Kim, and J. Schulman, “Quantifying generalization in reinforcement learning,” inProceedings of the 36th International Conference on Machine Learning. PMLR, May 2019, p. 1282–1289. [Online]. Available: https://proceedings.mlr.press/v97/ cobbe19a.html 1

  5. [5]

    A survey on unmanned surface vehicles for disaster robotics: Main challenges and directions,

    V . A. M. Jorge, R. Granada, R. G. Maidana, D. A. Jurak, G. Heck, A. P. F. Negreiros, D. H. dos Santos, L. M. G. Gonc ¸alves, and A. M. Amory, “A survey on unmanned surface vehicles for disaster robotics: Main challenges and directions,”Sensors, vol. 19, no. 3, 2019. [Online]. Available: https://www.mdpi.com/1424-8220/19/3/702 2

  6. [6]

    Impacts of model fidelity on trajectory optimization for autonomous vehicles in extreme maneuvers,

    J. K. Subosits and J. C. Gerdes, “Impacts of model fidelity on trajectory optimization for autonomous vehicles in extreme maneuvers,”IEEE Transactions on Intelligent Vehicles, vol. 6, no. 3, pp. 546–558, 2021. 2, 3, 6, 7

  7. [7]

    Autorally: An open platform for aggressive autonomous driving,

    B. Goldfain, P. Drews, C. You, M. Barulic, O. Velev, P. Tsiotras, and J. M. Rehg, “Autorally: An open platform for aggressive autonomous driving,”IEEE Control Systems Magazine, vol. 39, no. 1, pp. 26–55,

  8. [8]

    A multi-step dynamics modeling framework for autonomous driving in multiple environments,

    J. Gibson, B. Vlahov, D. Fan, P. Spieler, D. Pastor, A.-a. Agha- mohammadi, and E. A. Theodorou, “A multi-step dynamics modeling framework for autonomous driving in multiple environments,” in2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 7959–7965. 2, 3, 5, 6, 10

  9. [9]

    Low frequency sampling in model predictive path integral control,

    B. Vlahov, J. Gibson, D. D. Fan, P. Spieler, A.-a. Agha-mohammadi, and E. A. Theodorou, “Low frequency sampling in model predictive path integral control,”IEEE Robotics and Automation Letters, vol. 9, no. 5, pp. 4543–4550, 2024. 2, 7, 9, 10

  10. [10]

    Dynamics models in the aggressive off-road driving regime,

    T. Han, S. Talia, R. Panicker, P. Shah, N. Jawale, and B. Boots, “Dynamics models in the aggressive off-road driving regime,” 2024. [Online]. Available: https://arxiv.org/abs/2405.16487 2, 3, 6, 10

  11. [11]

    Information theoretic mpc for model-based reinforcement learning,

    G. Williams, N. Wagener, B. Goldfain, P. Drews, J. M. Rehg, B. Boots, and E. A. Theodorou, “Information theoretic mpc for model-based reinforcement learning,” in2017 IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 1714–1721. 2, 7, 10

  12. [12]

    Mppi- generic: A cuda library for stochastic trajectory optimization,

    B. Vlahov, J. Gibson, M. Gandhi, and E. A. Theodorou, “MPPI-Generic: a CUDA library for stochastic optimization,” 2024. [Online]. Available: https://arxiv.org/abs/2409.07563 2, 7, 13

  13. [13]

    Controlling uncertainty: A review of human behavior in complex dynamic environments,

    M. Osman, “Controlling uncertainty: A review of human behavior in complex dynamic environments,”Psychological Bulletin, vol. 136, no. 1, p. 65–86, 2010. 2

  14. [14]

    Toward automated vehicle control beyond the stability limits: Drifting along a general path,

    J. Y . Goh, T. Goel, and J. Christian Gerdes, “Toward automated vehicle control beyond the stability limits: Drifting along a general path,”Journal of Dynamic Systems, Measurement, and Control, vol. 142, no. 2, p. 021004, 11 2019. [Online]. Available: https: //doi.org/10.1115/1.4045320 3

  15. [15]

    Tyre modelling for use in vehicle dynamics studies,

    E. Bakker, L. Nyborg, and H. B. Pacejka, “Tyre modelling for use in vehicle dynamics studies,”SAE Transactions, vol. 96, pp. 190–204,

  16. [16]

    Available: http://www.jstor.org/stable/44470677 3

    [Online]. Available: http://www.jstor.org/stable/44470677 3

  17. [17]

    H. B. Pacejka and H. Pacejka,Tyre and vehicle dynamics, 2nd ed. Woburn, MA: Butterworth-Heinemann, Dec. 2005. 3

  18. [18]

    Dynamics modeling using visual terrain features for high-speed autonomous off-road driving,

    J. Gibson, A. Alavilli, E. Tevere, E. A. Theodorou, and P. Spieler, “Dynamics modeling using visual terrain features for high-speed autonomous off-road driving,” 2024. [Online]. Available: https: //arxiv.org/abs/2412.00581 3, 6, 10

  19. [19]

    Deep dynamics: Vehicle dynamics modeling with a physics-constrained neural network for autonomous racing,

    J. Chrosniak, J. Ning, and M. Behl, “Deep dynamics: Vehicle dynamics modeling with a physics-constrained neural network for autonomous racing,”IEEE Robotics and Automation Letters, vol. 9, no. 6, p. 5292–5297, Jun. 2024. 3

  20. [20]

    Autonomous drifting with 3 minutes of data via learned tire models,

    F. Djeumou, J. Y . M. Goh, U. Topcu, and A. Balachandran, “Autonomous drifting with 3 minutes of data via learned tire models,” no. arXiv:2306.06330, Jun. 2023, arXiv:2306.06330 [cs, eess]. [Online]. Available: http://arxiv.org/abs/2306.06330 3

  21. [21]

    Multivariate uncertainty in deep learning,

    R. L. Russell and C. Reale, “Multivariate uncertainty in deep learning,” IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 12, pp. 7937–7943, 2022. 3

  22. [22]

    Kalmannet: Neural network aided kalman filtering for partially known dynamics,

    G. Revach, N. Shlezinger, X. Ni, A. L. Escoriza, R. J. G. van Sloun, and Y . C. Eldar, “Kalmannet: Neural network aided kalman filtering for partially known dynamics,”IEEE Transactions on Signal Processing, vol. 70, p. 1532–1547, 2022. 3

  23. [23]

    A learning-based noise tracking method of adaptive kalman filter for uav positioning,

    H. Luo, Y . Luo, B. Han, and M. Zeng, “A learning-based noise tracking method of adaptive kalman filter for uav positioning,” in 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Oct. 2022, p. 440–445. [Online]. Available: https: //ieeexplore.ieee.org/document/9922222/?arnumber=9922222 3

  24. [24]

    Polynomial chaos: A tutorial and critique from a statisti- cian’s perspective,

    A. O’Hagan, “Polynomial chaos: A tutorial and critique from a statisti- cian’s perspective,” 2013. 3

  25. [25]

    Probabilistic dynamic modeling and control for skid-steered mobile robots in off-road environ- ments,

    A. Trivedi, S. Bazzi, M. Zolotas, and T. Padır, “Probabilistic dynamic modeling and control for skid-steered mobile robots in off-road environ- ments,” in2023 IEEE International Conference on Assured Autonomy (ICAA), Jun. 2023, p. 57–60. 3

  26. [26]

    Data-driven sampling based stochastic mpc for skid-steer mobile robot navigation,

    A. Trivedi, S. Prajapati, A. Shirgaonkar, M. Zolotas, and T. Padir, “Data-driven sampling based stochastic mpc for skid-steer mobile robot navigation,” no. arXiv:2411.03289, Nov. 2024, arXiv:2411.03289 [cs]. [Online]. Available: http://arxiv.org/abs/2411.03289 3

  27. [27]

    Gp-ukf: Unscented kalman filters with gaussian process prediction and observation models,

    J. Ko, D. J. Klein, D. Fox, and D. Haehnel, “Gp-ukf: Unscented kalman filters with gaussian process prediction and observation models,” in2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct. 2007, p. 1901–1907. [Online]. Available: https://ieeexplore.ieee.org/document/4399284/?arnumber=4399284 3

  28. [28]

    A survey of approximate quantile compu- tation on large-scale data (technical report),

    Z. Chen and A. Zhang, “A survey of approximate quantile compu- tation on large-scale data (technical report),”IEEE Access, vol. 8, p. 34585–34597, 2020, arXiv:2004.08255 [cs]. 3

  29. [29]

    Quantile propagation for wasserstein-approximate gaussian processes,

    R. Zhang, C. J. Walder, E. V . Bonilla, M.-A. Rizoiu, and L. Xie, “Quantile propagation for wasserstein-approximate gaussian processes,”

  30. [30]

    Available: https://arxiv.org/abs/1912.10200 3

    [Online]. Available: https://arxiv.org/abs/1912.10200 3

  31. [31]

    Deep reinforcement learning in a handful of trials using probabilistic dynamics models,

    K. Chua, R. Calandra, R. McAllister, and S. Levine, “Deep reinforcement learning in a handful of trials using probabilistic dynamics models,” inAdvances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., 2018. [Online]. Available: https://proceedings.neurips.cc/paper files/paper/ 2018/hash/3de568f8597b94bda53149c7d7f5958c-Abstract.html 3

  32. [32]

    Learning terrain-aware kin- odynamic model for autonomous off-road rally driving with model predictive path integral control,

    H. Lee, T. Kim, J. Mun, and W. Lee, “Learning terrain-aware kin- odynamic model for autonomous off-road rally driving with model predictive path integral control,”IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7663–7670, 2023. 3

  33. [33]

    Bridging active exploration and uncertainty-aware deployment using probabilistic ensemble neural network dynamics,

    T. Kim, J. Mun, J. Seo, B. Kim, and S. Hong, “Bridging active exploration and uncertainty-aware deployment using probabilistic ensemble neural network dynamics,” no. arXiv:2305.12240, May 2023, arXiv:2305.12240. [Online]. Available: http://arxiv.org/abs/2305.12240 3

  34. [34]

    Risk-aware mppi for stochastic hybrid systems,

    H. Parwana, M. Black, B. Hoxha, H. Okamoto, G. Fainekos, D. Prokhorov, and D. Panagou, “Risk-aware mppi for stochastic hybrid systems,” no. arXiv:2411.09198, Nov. 2024, arXiv:2411.09198. [Online]. Available: http://arxiv.org/abs/2411.09198 3 JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 20

  35. [35]

    Distributionally robust optimization with unscented transform for learning-based motion control in dynamic environments,

    A. Hakobyan and I. Yang, “Distributionally robust optimization with unscented transform for learning-based motion control in dynamic environments,” in2023 IEEE International Conference on Robotics and Automation (ICRA), May 2023, p. 3225–3232. 3

  36. [36]

    Trajectory distribution control for model predictive path integral control using covariance steering,

    J. Yin, Z. Zhang, E. Theodorou, and P. Tsiotras, “Trajectory distribution control for model predictive path integral control using covariance steering,” in2022 International Conference on Robotics and Automation (ICRA), May 2022, p. 1478–1484. [Online]. Available: https://ieeexplore.ieee.org/document/9811615/?arnumber=9811615 3

  37. [37]

    Towards efficient mppi trajectory generation with unscented guidance: U-mppi control strategy,

    I. S. Mohamed, J. Xu, G. S. Sukhatme, and L. Liu, “Towards efficient mppi trajectory generation with unscented guidance: U-mppi control strategy,” no. arXiv:2306.12369, Oct. 2023, arXiv:2306.12369 [cs, eess]. [Online]. Available: http://arxiv.org/abs/2306.12369 3

  38. [38]

    Adaptive risk sensitive model predictive control with stochastic search,

    Z. Wang, O. So, K. Lee, and E. A. Theodorou, “Adaptive risk sensitive model predictive control with stochastic search,” p. 13. 3, 9

  39. [39]

    Risk-aware model predictive path integral control using conditional value-at-risk,

    J. Yin, Z. Zhang, and P. Tsiotras, “Risk-aware model predictive path integral control using conditional value-at-risk,” in 2023 IEEE International Conference on Robotics and Automation (ICRA), May 2023, pp. 7937–7943. [Online]. Available: https://ieeexplore.ieee.org/document/10161100/?arnumber=10161100 3

  40. [40]

    Multistep Prediction of Dynamic Systems with Recurrent Neural Networks,

    N. Mohajerin and S. L. Waslander, “Multistep Prediction of Dynamic Systems with Recurrent Neural Networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3370–3383, 2019. 4

  41. [41]

    Model predictive control for aggressive driving over uneven terrain,

    T. Han, A. Liu, A. Li, A. Spitzer, G. Shi, and B. Boots, “Model predictive control for aggressive driving over uneven terrain,” no. arXiv:2311.12284, Nov. 2023, arXiv:2311.12284 [cs]. [Online]. Available: http://arxiv.org/abs/2311.12284 6, 10

  42. [42]

    Step: Stochastic traversability evaluation and planning for risk-aware off-road navigation,

    D. D. Fan, K. Otsu, Y. Kubo, A. Dixit, J. Burdick, and A.-A. Agha-Mohammadi, “Step: Stochastic traversability evaluation and planning for risk-aware off-road navigation,” in Robotics: Science and Systems. RSS Foundation, 2021, pp. 1–21. 6

  43. [43]

    G-vom: A gpu accelerated voxel off-road mapping system,

    T. Overbye and S. Saripalli, “G-vom: A gpu accelerated voxel off-road mapping system,” in 2022 IEEE Intelligent Vehicles Symposium (IV), 2022, pp. 1480–1486. 6, 10

  44. [44]

    Few-shot semantic learning for robust multi-biome 3d semantic mapping in off-road environments,

    D. Atha, X. Lei, S. Khattak, A. Sabel, E. Miller, A. Noca, G. Lim, J. Edlund, C. Padgett, and P. Spieler, “Few-shot semantic learning for robust multi-biome 3d semantic mapping in off-road environments,”

  45. [45]

    [Online]. Available: https://arxiv.org/abs/2411.06632 6, 10, 13, 14

  46. [46]

    Automotive applications of explicit non-linear model predictive control,

    M. Metzler, “Automotive applications of explicit non-linear model predictive control,” Ph.D. dissertation, Nov. 2020. 6

  47. [47]

    Longitudinal vehicle dynamics using simulink/matlab,

    P. Shakouri, A. Ordys, M. Askari, and D. S. Laila, “Longitudinal vehicle dynamics using simulink/matlab,” vol. 2010, Jan. 2010, pp. 1–6. 6

  48. [48]

    The unscented kalman filter for nonlinear estimation,

    E. Wan and R. Van Der Merwe, “The unscented kalman filter for nonlinear estimation,” in Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium (Cat. No.00EX373). Lake Louise, Alta., Canada: IEEE, 2000, pp. 153–158. [Online]. Available: http://ieeexplore.ieee.org/document/882463/ 7

  49. [49]

    Choleski-banachiewicz approach to systems with non-positive definite matrices with mathematica®,

    R. A. Walentyński, “Choleski-banachiewicz approach to systems with non-positive definite matrices with mathematica®,” in Computational Science - ICCS 2004, M. Bubak, G. D. van Albada, P. M. A. Sloot, and J. Dongarra, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 311–318. 7

  50. [50]

    ROSE: Robust State Estimation via Online Covariance Adaptation,

    S. Fakoorian, K. Otsu, S. Khattak, M. Palieri, and A. A. Agha-mohammadi, “ROSE: Robust State Estimation via Online Covariance Adaptation,” International Foundation of Robotics Research (ISRR),

  51. [51]

    Graph-based multi-sensor fusion for consistent localization of autonomous construction robots,

    J. Nubert, S. Khattak, and M. Hutter, “Graph-based multi-sensor fusion for consistent localization of autonomous construction robots,” in IEEE International Conference on Robotics and Automation (ICRA). IEEE,

  52. [52]

    Pytorch: An imperative style, high-performance deep learning library,

    A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Sy...

  53. [53]

    Adam: A method for stochastic optimization,

    D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”

  54. [54]

    Adam: A Method for Stochastic Optimization

    [Online]. Available: https://arxiv.org/abs/1412.6980 10

    VII. APPENDIX

    A. Additional Ablation Study Results

    The appendix includes additional box plots that are not focused on the MSE loss, to better underscore smaller points made in the results section. The main point to be made is that uncertain...

  55. [55]

    [Figure 16 omitted. Panel titles recovered from the extraction: (b) Structure Changes 2, (c) Structure Changes 3, (d) Architecture, (e) Initialization, (f) Loss Structure, (g) Buffer History (τ), (h) Final Comparison. Vertical axis: Engine RPM.]

    Fig. 16. Box plots showing e (engine RPM) error at the end of the 5 s trajectory for various structural changes to the networks. Whiskers are defined as ±1.5 IQR and are given with the arrows; the green line marks the median and the orange line the mean value. The baseline model (bline) is kept consistent. The explanations of the keys used in sh...