Trajectory Planning for Safe Dual Control with Active Exploration

Devansh R. Agrawal; Dimitra Panagou; Kaleb Ben Naveed; Manveer Singh

arxiv: 2604.15507 · v1 · submitted 2026-04-16 · 💻 cs.RO

Pith reviewed 2026-05-10 10:18 UTC · model grok-4.3

classification 💻 cs.RO

keywords: dual control, safe trajectory planning, active exploration, robust planning, model uncertainty, quadrotor navigation, autonomous racing, budget-constrained planning

The pith

Dual-gatekeeper framework enables safe active exploration in dual control by pursuing uncertainty reduction only under verifiable improvements that preserve safety and budget limits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Planning safe trajectories under model uncertainty requires balancing immediate task performance against long-term gains from reducing that uncertainty. Standard robust planners stay safe by assuming worst cases but become overly conservative, while most dual-control methods add exploration via tuned weights without formal checks on when it is worth doing. The paper introduces Dual-gatekeeper to enforce that exploration occurs only when it delivers a verifiable net improvement without violating safety constraints or a mission-level cost budget. This setup lets the system reduce uncertainty on the fly during nominal missions while keeping formal guarantees intact, as shown in quadrotor navigation and autonomous car racing examples under parametric uncertainty.

Core claim

We propose Dual-gatekeeper, a framework that integrates robust planning with active exploration under formal guarantees of safety and budget feasibility. The key idea is that exploration is pursued only when it provides a verifiable improvement without compromising safety or violating the budget, enabling the system to balance immediate task performance with long-term uncertainty reduction in a principled manner.

What carries the argument

Dual-gatekeeper, the decision layer that performs verifiable improvement checks before allowing exploration and integrates them with existing robust safety mechanisms.

If this is right

Exploration is added only when it demonstrably improves expected outcomes without degrading task performance beyond the allowed budget.
Formal safety guarantees remain intact because exploration never bypasses the underlying robust planner.
Two concrete implementations exist, each tied to a different safety mechanism, and both are tested on quadrotor navigation and car racing under parametric uncertainty.
The approach explicitly separates the decision of whether to explore from the low-level trajectory generation.
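
The separation between the exploration decision and trajectory generation can be sketched as a small acceptance filter. This is a minimal illustration, not the paper's algorithm: the `Candidate` fields, the use of a predicted uncertainty width as the improvement measure, and the tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Summary of one candidate trajectory (all fields are hypothetical)."""
    name: str
    robustly_safe: bool           # safety verified for every admissible parameter value
    worst_case_extra_cost: float  # upper bound on added cost relative to nominal behavior
    predicted_width: float        # predicted uncertainty width after executing the candidate


def select_trajectory(
    candidates: Sequence[Candidate],
    current_width: float,
    remaining_budget: float,
) -> Optional[Candidate]:
    """Return the admissible candidate with the smallest predicted width.

    Returns None when no candidate passes all three checks, which stands for
    "keep following the conservative backup trajectory".
    """
    admissible = [
        c for c in candidates
        if c.robustly_safe                                # safety is never traded away
        and c.worst_case_extra_cost <= remaining_budget   # budget holds in the worst case
        and c.predicted_width < current_width             # exploration must actually help
    ]
    if not admissible:
        return None
    return min(admissible, key=lambda c: c.predicted_width)


candidates = [
    Candidate("aggressive", robustly_safe=False, worst_case_extra_cost=1.0, predicted_width=0.1),
    Candidate("expensive", robustly_safe=True, worst_case_extra_cost=5.0, predicted_width=0.2),
    Candidate("modest", robustly_safe=True, worst_case_extra_cost=1.5, predicted_width=0.4),
    Candidate("uninformative", robustly_safe=True, worst_case_extra_cost=0.2, predicted_width=0.8),
]
chosen = select_trajectory(candidates, current_width=0.8, remaining_budget=2.0)
print(chosen.name if chosen else "backup")
```

In this sketch the trajectory generator only has to produce candidates and their bounds; the filter alone decides whether any of them replaces the backup.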

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same verifiable-check structure could be applied to other planning domains where uncertainty reduction competes with a hard resource limit, such as energy-constrained mobile robots.
If the improvement checks can be computed quickly, the method might allow tighter real-time margins than purely robust planners while still meeting safety specifications.
Extending the budget concept to include time or energy could produce versions suitable for long-horizon missions where learning must occur without starving the primary task.

Load-bearing premise

That exploration decisions can be made via verifiable checks that integrate with safety mechanisms without introducing new risks or budget violations.

What would settle it

A concrete counter-example in which the framework approves an exploration action that later causes a safety violation or budget overrun under the same uncertainty model.

Figures

Figures reproduced from arXiv: 2604.15507 by Devansh R. Agrawal, Dimitra Panagou, Kaleb Ben Naveed, Manveer Singh.

**Figure 1.** Predicted Cost Example.

**Figure 2.** The mission objective, safety requirements, and uncertainty bounds are used by a robust trajectory planner to compute a conservative …

**Figure 3.** The proposed framework at a glance. Starting from a conservative backup trajectory, multiple candidate trajectories are generated, …

**Figure 4.** Illustration of the gatekeeper instantiation. Starting from prerequisite policies, a backup policy and a nominal mission policy, the …

**Figure 5.** Backup, candidate, and final solution trajectories for Case Study 1 (top) and Case Study 2 (bottom).

**Figure 6.** Parameter bound evolution is shown for two studies: Case Study 1 ( …

**Figure 7.** Trajectories over the last 5 laps for each method. For (a), the successful Trial 5 run is shown. Additional details on the trials can …

Original abstract

Planning safe trajectories under model uncertainty is a fundamental challenge. Robust planning ensures safety by considering worst-case realizations, yet ignores uncertainty reduction and leads to overly conservative behavior. Actively reducing uncertainty on-the-fly during a nominal mission defines the dual control problem. Most approaches address this by adding a weighted exploration term to the cost, tuned to trade off the nominal objective and uncertainty reduction, but without formal consideration of when exploration is beneficial. Moreover, safety is enforced in some methods but not in others. We study a budget-constrained dual control problem, where uncertainty is reduced subject to safety and a mission-level cost budget that limits the allowable degradation in task performance due to exploration. In this work, we propose Dual-gatekeeper, a framework that integrates robust planning with active exploration under formal guarantees of safety and budget feasibility. The key idea is that exploration is pursued only when it provides a verifiable improvement without compromising safety or violating the budget, enabling the system to balance immediate task performance with long-term uncertainty reduction in a principled manner. We provide two implementations of the framework based on different safety mechanisms and demonstrate its performance on quadrotor navigation and autonomous car racing case studies under parametric uncertainty.
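
One way to read the budget-constrained problem described in the abstract is as a constrained optimization over feedback policies. The formulation below is a hedged sketch, not the paper's own statement: the symbols $W$ (a measure of remaining uncertainty), $\mathcal{S}$ (the safe set), $\Theta$ (the admissible parameter set), $J$ (the mission cost), $\pi_{\mathrm{nom}}$ (the nominal policy), and $B$ (the budget) are all assumed names introduced for illustration.

```latex
\begin{aligned}
\min_{\pi \in \Pi} \quad & W(\pi)
  && \text{(remaining uncertainty after executing } \pi\text{)} \\
\text{s.t.} \quad & x_{\pi}(t;\theta) \in \mathcal{S}
  && \forall t \in T,\ \forall \theta \in \Theta
  && \text{(robust safety)} \\
& J(\pi;\theta) - J(\pi_{\mathrm{nom}};\theta) \le B
  && \forall \theta \in \Theta
  && \text{(budget on added mission cost)}
\end{aligned}
```

Under this reading, both constraints quantify over every admissible parameter value, so a policy is acceptable only if safety and the budget hold in the worst case.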

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Dual-gatekeeper, a framework for budget-constrained dual control that integrates robust planning (worst-case safety) with active exploration. Exploration occurs only when a verifiable improvement check confirms uncertainty reduction without safety compromise or violation of a mission-level cost budget limiting task-performance degradation. Two implementations based on different safety mechanisms are presented and evaluated on quadrotor navigation and autonomous car racing under parametric uncertainty, with claims of formal guarantees for safety and budget feasibility.

Significance. If the formal guarantees are rigorously established with consistent uncertainty modeling, the work offers a principled way to mitigate conservatism in robust planning while preserving performance bounds, which could influence safe exploration strategies in robotics. The budget-constrained formulation and case-study demonstrations are practical strengths; reproducible implementations and falsifiable safety/budget claims would further enhance impact.

major comments (2)

§3 (Framework and verifiable improvement): The central guarantee of budget feasibility requires that the verifiable improvement check employs the same set-based or min-max uncertainty model used for robust safety and budget bounding. If the check instead relies on nominal trajectories, expected information gain, or probabilistic variance reduction, a trajectory can satisfy the check yet produce task-cost degradation exceeding the budget under admissible uncertainty realizations. This mismatch must be resolved with an explicit equivalence proof or worst-case bound.
§5 (Case studies): In the quadrotor and car-racing examples under parametric uncertainty, exploration alters state distributions that the robust planner must still cover. The manuscript should include explicit verification (e.g., via additional worst-case simulations or tube-propagation analysis) that budget adherence holds for all admissible realizations, not merely nominal or average-case trajectories.
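
The mismatch raised in the first major comment can be made concrete with a toy example. The sketch below assumes a hypothetical added cost that is affine in a single uncertain parameter `theta`; the cost model, the numbers, and the function names are illustrative and do not come from the paper.

```python
def extra_cost(theta: float, base: float = 0.2, sensitivity: float = 1.0) -> float:
    """Hypothetical added mission cost of an exploratory trajectory, affine in theta."""
    return base + sensitivity * theta


def nominal_check(theta_nominal: float, budget: float) -> bool:
    """Budget check evaluated only at the nominal parameter estimate."""
    return extra_cost(theta_nominal) <= budget


def worst_case_check(theta_lo: float, theta_hi: float, budget: float) -> bool:
    """Budget check over the whole interval [theta_lo, theta_hi].

    Because the cost is affine in theta, its maximum over an interval is
    attained at an endpoint, so checking both endpoints is sufficient here.
    """
    return max(extra_cost(theta_lo), extra_cost(theta_hi)) <= budget


budget = 1.5
passes_nominal = nominal_check(theta_nominal=1.0, budget=budget)
passes_worst_case = worst_case_check(theta_lo=0.5, theta_hi=1.5, budget=budget)
print(passes_nominal, passes_worst_case)
```

With these numbers the nominal check accepts the trajectory while the interval check rejects it, which is the situation the referee asks the authors to rule out.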

minor comments (2)

Abstract: The strong claim of 'formal guarantees' appears without a proof sketch, key theorem statement, or reference to the main condition; a one-sentence pointer to the central result would improve clarity.
Notation and definitions: Ensure uniform terminology for uncertainty sets across the robust planner, budget constraint, and improvement check; inconsistent usage risks reader confusion about model compatibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the formal presentation of our work. We address each major comment below.

Referee: §3 (Framework and verifiable improvement): The central guarantee of budget feasibility requires that the verifiable improvement check employs the same set-based or min-max uncertainty model used for robust safety and budget bounding. If the check instead relies on nominal trajectories, expected information gain, or probabilistic variance reduction, a trajectory can satisfy the check yet produce task-cost degradation exceeding the budget under admissible uncertainty realizations. This mismatch must be resolved with an explicit equivalence proof or worst-case bound.

Authors: We agree that consistency between the verifiable improvement check and the robust uncertainty model is required to uphold the budget guarantees. In Dual-gatekeeper the check is formulated using the identical set-based representation as the robust planner and budget bounds, ensuring that any accepted exploration trajectory cannot exceed the budget under admissible realizations. To make this explicit, we will add a short equivalence proof and worst-case bound derivation to the revised §3. revision: yes

Referee: §5 (Case studies): In the quadrotor and car-racing examples under parametric uncertainty, exploration alters state distributions that the robust planner must still cover. The manuscript should include explicit verification (e.g., via additional worst-case simulations or tube-propagation analysis) that budget adherence holds for all admissible realizations, not merely nominal or average-case trajectories.

Authors: The referee correctly notes that the current case-study results emphasize nominal and average-case trajectories. While the framework itself provides worst-case guarantees, the empirical sections would be strengthened by direct verification over the full uncertainty sets. We will therefore augment both the quadrotor and car-racing studies with tube-propagation analysis and additional worst-case simulations to confirm budget adherence for all admissible realizations. revision: yes
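
A tube-propagation check of the kind mentioned in this exchange can be sketched on a scalar toy system. The example assumes dynamics x_{k+1} = a * x_k + u_k with an unknown gain a in [a_lo, a_hi], a fixed input sequence, and a quadratic stage cost; none of these choices are taken from the paper's case studies.

```python
from typing import List, Sequence, Tuple


def propagate_tube(
    x0: float, inputs: Sequence[float], a_lo: float, a_hi: float
) -> List[Tuple[float, float]]:
    """Interval bounds on x_k for x_{k+1} = a * x_k + u_k with a in [a_lo, a_hi].

    The product of two intervals is bounded by the min and max of the four
    endpoint products. Treating a as free at every step over-approximates the
    constant-parameter case, so the tube is conservative.
    """
    lo, hi = x0, x0
    tube = [(lo, hi)]
    for u in inputs:
        products = (a_lo * lo, a_lo * hi, a_hi * lo, a_hi * hi)
        lo, hi = min(products) + u, max(products) + u
        tube.append((lo, hi))
    return tube


def worst_case_cost_bound(tube: Sequence[Tuple[float, float]]) -> float:
    """Upper bound on sum_k x_k^2 for any trajectory that stays inside the tube."""
    return sum(max(lo * lo, hi * hi) for lo, hi in tube)


def simulate_cost(x0: float, inputs: Sequence[float], a: float) -> float:
    """Cost sum_k x_k^2 for one fixed realization of the parameter a."""
    x, cost = x0, x0 * x0
    for u in inputs:
        x = a * x + u
        cost += x * x
    return cost


inputs = [0.1, -0.2, 0.1]
tube = propagate_tube(x0=1.0, inputs=inputs, a_lo=0.8, a_hi=1.2)
bound = worst_case_cost_bound(tube)
sampled_costs = [simulate_cost(1.0, inputs, a) for a in (0.8, 0.9, 1.0, 1.1, 1.2)]
print(bound, max(sampled_costs))
```

Comparing `bound` against a budget certifies every realization in the interval at once, whereas the sampled simulations can only falsify the bound, not prove it.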

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

The paper proposes Dual-gatekeeper as an integration of existing robust planning methods with an added active exploration mechanism under explicit safety and budget constraints. The key claim—that exploration occurs only upon a verifiable improvement check—is introduced as an independent decision layer rather than a redefinition or tautological fit of the robust planner's outputs. No load-bearing steps reduce by construction to inputs via self-definition, fitted parameters renamed as predictions, or self-citation chains; the framework builds on prior dual-control concepts with added formal verification elements that retain independent content. Case studies demonstrate application without evidence that results are forced by the initial assumptions or uncertainty models.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated in the provided text. The budget constraint and safety mechanisms are treated as given inputs from prior robust-planning literature.

axioms (1)

Domain assumption: Robust planning mechanisms can be combined with active exploration while preserving formal safety guarantees.
Central to the Dual-gatekeeper integration described in the abstract.

pith-pipeline@v0.9.0 · 5512 in / 1241 out tokens · 26113 ms · 2026-05-10T10:18:21.197821+00:00 · methodology


[1] [1]

Enabling safety for aerial robots: Planning and control architectures,

K. B. Naveed, D. R. Agrawal, D. M. Cherenson, H. Lee, A. Gilbert, H. Parwana, V . S. Chipade, W. Bentz, and D. Panagou, “Enabling safety for aerial robots: Planning and control architectures,”arXiv preprint arXiv:2504.08601, 2025

work page arXiv 2025

[2] [2]

Dynamic tube mpc for nonlinear systems,

B. T. Lopez, J.-J. E. Slotine, and J. P. How, “Dynamic tube mpc for nonlinear systems,” in2019 American Control Conference (ACC). IEEE, 2019, pp. 1655–1662

work page 2019

[3] [3]

Risk-averse trajectory opti- mization via sample average approximation,

T. Lew, R. Bonalli, and M. Pavone, “Risk-averse trajectory opti- mization via sample average approximation,”IEEE Robotics and Automation Letters, vol. 9, no. 2, pp. 1500–1507, 2023

work page 2023

[4] [5]

Autonomous drone racing: A survey,

D. Hanover, A. Loquercio, L. Bauersfeld, A. Romero, R. Penicka, Y . Song, G. Cioffi, E. Kaufmann, and D. Scaramuzza, “Autonomous drone racing: A survey,”IEEE Transactions on Robotics, vol. 40, pp. 3044–3067, 2024

work page 2024

[5] [6]

Learning model predictive control with error dynamics regression for autonomous racing,

H. Xue, E. L. Zhu, J. M. Dolan, and F. Borrelli, “Learning model predictive control with error dynamics regression for autonomous racing,” in2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2024, pp. 13 250–13 256

work page 2024

[6] [7]

Lla- mpc: Fast adaptive control for autonomous racing,

M. F. AL-Sunni, H. Almubarak, K. Horng, and J. M. Dolan, “Lla- mpc: Fast adaptive control for autonomous racing,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2025, pp. 1969–1976

work page 2025

[7] [8]

MPCC++: Model Predictive Contouring Control for Time-Optimal Flight with Safety Constraints,

M. Krinner, A. Romero, L. Bauersfeld, M. Zeilinger, A. Carron, and D. Scaramuzza, “MPCC++: Model Predictive Contouring Control for Time-Optimal Flight with Safety Constraints,” inProceedings of Robotics: Science and Systems, Delft, Netherlands, July 2024

work page 2024

[8] [9]

gatekeeper: Online safety verification and control for nonlinear systems in dynamic environ- ments,

D. R. Agrawal, R. Chen, and D. Panagou, “gatekeeper: Online safety verification and control for nonlinear systems in dynamic environ- ments,”IEEE Transactions on Robotics, 2024

work page 2024

[9] [10]

A proba- bilistic particle-control approximation of chance-constrained stochastic predictive control,

L. Blackmore, M. Ono, A. Bektassov, and B. C. Williams, “A proba- bilistic particle-control approximation of chance-constrained stochastic predictive control,”IEEE transactions on Robotics, vol. 26, no. 3, pp. 502–517, 2010

work page 2010

[10] [11]

Robust control barrier and control lyapunov functions with fixed-time convergence guarantees,

K. Garg and D. Panagou, “Robust control barrier and control lyapunov functions with fixed-time convergence guarantees,” in2021 American Control Conference (ACC), 2021, pp. 2292–2297

work page 2021

[11] [12]

J. J. Choi, D. Lee, K. Sreenath, C. J. Tomlin, and S. L. Herbert, “Robust control barrier–value functions for safety-critical control,” in 2021 60th IEEE Conference on Decision and Control (CDC), 2021, pp. 6814–6821.

[12] [13]

S. Singh, A. Majumdar, J.-J. Slotine, and M. Pavone, “Robust online motion planning via contraction theory and convex optimization,” in 2017 IEEE International Conference on Robotics and Automation (ICRA), 2017, pp. 5883–5890.

[13] [14]

A. Feldbaum, “Theory of dual control,” Autom. Remote Control, vol. 22, no. 1, pp. 3–19, 1961.

[14] [15]

A. Parsi, D. Liu, A. Iannelli, and R. S. Smith, “Dual adaptive MPC using an exact set-membership reformulation,” IFAC-PapersOnLine, vol. 56, no. 2, pp. 8457–8463, 2023.

[15] [16]

L. Barcelos, A. Lambert, R. Oliveira, P. Borges, B. Boots, and F. Ramos, “Dual Online Stein Variational Inference for Control and Dynamics,” in Proceedings of Robotics: Science and Systems, 2021.

[16] [17]

B. Luo, Y. Zhang, A. Dubey, and A. Mukhopadhyay, “Act as you learn: Adaptive decision-making in non-stationary Markov decision processes,” in Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024, pp. 1301–1309.

[17] [18]

A. Sasfi, M. N. Zeilinger, and J. Köhler, “Robust adaptive MPC using control contraction metrics,” Automatica, vol. 155, p. 111169, 2023.

[18] [19]

H. Hu, D. Isele, S. Bae, and J. F. Fisac, “Active uncertainty reduction for safe and efficient interaction planning: A shielding-aware dual control approach,” The International Journal of Robotics Research, vol. 43, no. 9, pp. 1382–1408, 2024.

[19] [20]

J. Sieber, A. Didier, and M. N. Zeilinger, “Computationally efficient system level tube-MPC for uncertain systems,” Automatica, vol. 180, p. 112466, 2025.

[20] [21]

W. D. Compton, N. Csomay-Shanklin, C. Johnson, and A. D. Ames, “Dynamic tube MPC: Learning tube dynamics with massively parallel simulation for robust safety in practice,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 2613–2619.

[21] [22]

T. Lew, R. Bonalli, and M. Pavone, “Sample average approximation for stochastic programming with equality constraints,” SIAM Journal on Optimization, vol. 34, no. 4, pp. 3506–3533, 2024.

[22] [23]

J. J. Choi, D. Lee, K. Sreenath, C. J. Tomlin, and S. L. Herbert, “Robust control barrier–value functions for safety-critical control,” in 2021 60th IEEE Conference on Decision and Control (CDC). IEEE, 2021, pp. 6814–6821.

[23] [24]

L. Knoedler, O. So, J. Yin, M. Black, Z. Serlin, P. Tsiotras, J. Alonso-Mora, and C. Fan, “Safety on the fly: Constructing robust safety filters via policy control barrier functions at runtime,” IEEE Robotics and Automation Letters, 2025.

[24] [25]

T. Lew, A. Sharma, J. Harrison, A. Bylard, and M. Pavone, “Safe active dynamics learning and control: A sequential exploration–exploitation framework,” IEEE Transactions on Robotics, vol. 38, no. 5, pp. 2888–2907, 2022.

[25] [26]

A. Parsi, A. Iannelli, and R. S. Smith, “Active exploration in adaptive model predictive control,” in 2020 59th IEEE Conference on Decision and Control (CDC). IEEE, 2020, pp. 6186–6191.

[26] [27]

T. Kim, J. Mun, J. Seo, B. Kim, and S. Hong, “Bridging Active Exploration and Uncertainty-Aware Deployment Using Probabilistic Ensemble Neural Network Dynamics,” in Proceedings of Robotics: Science and Systems, Daegu, Republic of Korea, July 2023.

[27] [28]

A. Davydov, F. Djeumou, M. Greiff, M. Suminaka, M. Thompson, J. Subosits, and T. Lew, “First, learn what you don’t know: Active information gathering for driving at the limits of handling,” IEEE Robotics and Automation Letters, 2025.

[28] [29]

M. Hibbard, A. P. Vinod, J. Quattrociocchi, and U. Topcu, “Safely: safe stochastic motion planning under constrained sensing via duality,” IEEE Transactions on Robotics, vol. 39, no. 5, pp. 3464–3478, 2023.

[29] [30]

M. Prajapat, J. Köhler, M. Turchetta, A. Krause, and M. N. Zeilinger, “Safe guaranteed exploration for non-linear systems,” IEEE Transactions on Automatic Control, 2025.

[30] [31]

A. Mesbah, “Stochastic model predictive control with active uncertainty learning: A survey on dual control,” Annual Reviews in Control, vol. 45, pp. 107–117, 2018.

[31] [32]

Z. Li, W.-H. Chen, and J. Yang, “A dual control perspective for exploration and exploitation in autonomous search,” in 2022 European Control Conference (ECC). IEEE, 2022, pp. 1876–1881.

[32] [33]

E. Arcari, L. Hewing, M. Schlichting, and M. Zeilinger, “Dual stochastic MPC for systems with parametric and structural uncertainty,” in Learning for Dynamics and Control. PMLR, 2020, pp. 894–903.

[33] [34]

J. W. Knaup and P. Tsiotras, “Adaptive dual covariance steering with active parameter estimation,” in 2024 IEEE 63rd Conference on Decision and Control (CDC). IEEE, 2024, pp. 659–664.

[34] [35]

B. Johnson, Q. Zhu, R. Prucka, M. Barron, M. Figueroa-Santos, and M. Castanier, “Implicit dual-control for visibility-aware navigation in unstructured environments,” arXiv preprint arXiv:2507.04371, 2025.

[35] [36]

R. Soloperto, J. Köhler, and F. Allgöwer, “Augmenting MPC schemes with active learning: Intuitive tuning and guaranteed performance,” IEEE Control Systems Letters, vol. 4, no. 3, pp. 713–718, 2020.

[36] [37]

B. Zhang, Z. Zhou, and R. Vasudevan, “Provably-Safe, Online System Identification,” in Proceedings of Robotics: Science and Systems, Los Angeles, CA, USA, July 2025.

[37] [38]

B. T. Lopez, “Adaptive robust model predictive control for nonlinear systems,” Ph.D. dissertation, MIT, 2019.

[38] [39]

F. Li, M. Fu, W. Chen, F. Zhang, H. Zhang, H. Qu, and Z. Yi, “Improving exploration in actor–critic with weakly pessimistic value estimation and optimistic policy optimization,” IEEE Transactions on Neural Networks and Learning Systems, vol. 35, no. 7, pp. 8783–8796, 2024.

[39] [40]

B. Tasdighi, N. Werge, Y.-S. Wu, and M. Kandemir, “Exploring pessimism and optimism dynamics in deep reinforcement learning,” in Seventeenth European Workshop on Reinforcement Learning, 2024.

[40] [41]

T. Moskovitz, J. Parker-Holder, A. Pacchiano, M. Arbel, and M. Jordan, “Tactical optimism and pessimism for deep reinforcement learning,” Advances in Neural Information Processing Systems, vol. 34, pp. 12 849–12 863, 2021.

[41] [42]

K. B. Naveed, D. R. Agrawal, and D. Panagou, “A formal gatekeeper framework for safe dual control with active exploration,” arXiv preprint arXiv:2510.06351, 2025.

[42] [43]

M. Cohen and C. Belta, Adaptive Control Lyapunov Functions. Cham: Springer International Publishing, 2023, pp. 57–76. [Online]. Available: https://doi.org/10.1007/978-3-031-29310-8_4

[43] [44]

K. S. Narendra and A. M. Annaswamy, “Persistent excitation in adaptive systems,” International Journal of Control, vol. 45, no. 1, pp. 127–160, 1987.

[44] [45]

M. Cohen and C. Belta, “Robust safety-critical control for systems with actuation uncertainty,” in Adaptive and Learning-Based Control of Safety-Critical Systems. Springer, 2023, pp. 117–131.

[45] [46]

M. Milanese and C. Novara, “Set membership identification of nonlinear systems,” Automatica, vol. 40, no. 6, pp. 957–975, 2004.

[46] [47]

A. Majumdar and M. Pavone, “How should a robot assess risk? Towards an axiomatic theory of risk in robotics,” in Robotics Research: The 18th International Symposium ISRR. Springer, 2019, pp. 75–84.

APPENDIX

A. Interpretation of the Predicted Width

This subsection interprets the predicted uncertainty width by relating it to the width obtained from realized ...

Dynamic Bicycle Model: The vehicle is modeled using a planar dynamic bicycle model with uncertain tire-road friction. The state is

$$x = \begin{bmatrix} p_x & p_y & \psi & v_x & v_y & \omega & \delta \end{bmatrix}^\top \in \mathbb{R}^7, \tag{99}$$

where $(p_x, p_y)$ is the global position, $\psi$ the yaw angle, $v_x$ and $v_y$ the body-frame longitudinal and lateral velocities, $\omega$ the yaw rate, and $\delta$ the steering angle. The control input is $u = [\,F_d \;\; F_b \;\; \dot{\delta}$cm...

Linear in Parameter Form: For the uncertainty reduction module, the model is written in linear-in-parameter form as

$$\dot{x} = f_0(x) + g_0(x)\,u + \Phi(x)\,\mu + w(t), \tag{107}$$

where $\Phi(x)$ is the regressor associated with the friction parameter and $w(t)$ is a bounded disturbance. Defining the lateral tire forces without the friction coefficient as $\bar{F}_{y,f}(x) = F_{z,f} \sin\big(C_f \tan^{-1}(B_f\,\alpha_f(x$...
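Because the unknown friction parameter enters the model linearly, each measurement of the residual ẋ − f0(x) − g0(x)u places a linear constraint on µ once the disturbance bound is known. The following is a minimal sketch of that idea, not the estimator used in the paper (which is not shown in this excerpt). It assumes a scalar parameter, a scalar regressor value `phi`, and a known bound `w_bar` on |w(t)|; the function names and all numerical values are hypothetical.

```python
import numpy as np


def update_interval(lo, hi, phi, r, w_bar):
    """Intersect [lo, hi] with the set of mu satisfying |r - phi * mu| <= w_bar.

    r plays the role of the residual x_dot - f0(x) - g0(x) u (one component),
    phi is the matching regressor entry, w_bar bounds the disturbance.
    """
    if abs(phi) < 1e-9:
        return lo, hi  # this sample carries no information about mu
    a = (r - w_bar) / phi
    b = (r + w_bar) / phi
    new_lo, new_hi = min(a, b), max(a, b)  # handles negative phi
    return max(lo, new_lo), min(hi, new_hi)


def run_demo(seed=0):
    """Shrink a prior interval on mu using synthetic, bounded-noise samples."""
    rng = np.random.default_rng(seed)
    mu_true, w_bar = 0.8, 0.05      # hypothetical friction value and noise bound
    lo, hi = 0.3, 1.2               # hypothetical prior parameter interval
    for _ in range(50):
        phi = rng.uniform(0.5, 2.0)           # regressor value along the trajectory
        w = rng.uniform(-w_bar, w_bar)        # bounded disturbance
        r = phi * mu_true + w                 # measured residual
        lo, hi = update_interval(lo, hi, phi, r, w_bar)
    return lo, hi, mu_true


if __name__ == "__main__":
    lo, hi, mu_true = run_demo()
    print(f"mu in [{lo:.4f}, {hi:.4f}], true value {mu_true}")
```

In this toy setting the interval width after one sample is 2·w_bar/|phi|, so samples with a larger regressor magnitude are more informative. This is one way to read why an exploration trajectory that excites the regressor can reduce the uncertainty width faster.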

Nominal MPC Planner: The fallback policy corresponds to a conservative controller that follows the track centerline using a pure pursuit strategy with reduced speed, providing a safe fallback behavior that maintains large safety margins with respect to track boundaries. The nominal MPC solves

$$\min_{x_{0:N},\,u_{0:N-1}} \; \sum_{k=0}^{N-1} \ell(x_k, u_k, \Delta u_k) + \ell_N(x_N), \tag{111}$$

with ...
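The fallback policy is described as a pure pursuit tracker of the track centerline. As a rough illustration, the standard pure pursuit steering law for a bicycle model can be sketched as below. The function name, the wheelbase value, and the choice of lookahead point are assumptions made for this example; the paper's actual fallback controller parameters are not given in this excerpt.

```python
import math


def pure_pursuit_steering(x, y, yaw, target_x, target_y, wheelbase):
    """Return the steering angle (rad) that drives the rear axle toward a lookahead point.

    Standard pure pursuit geometry: delta = atan(2 * L * sin(alpha) / l_d),
    where alpha is the bearing of the target in the vehicle frame and l_d the
    distance to it.
    """
    dx, dy = target_x - x, target_y - y
    lookahead = math.hypot(dx, dy)
    if lookahead < 1e-9:
        return 0.0  # already at the target point; keep the wheels straight
    alpha = math.atan2(dy, dx) - yaw
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


if __name__ == "__main__":
    # Hypothetical centerline point 5 m ahead and 5 m to the left of the vehicle.
    delta = pure_pursuit_steering(0.0, 0.0, 0.0, 5.0, 5.0, wheelbase=2.5)
    print(f"steering command: {delta:.4f} rad")
```

A conservative fallback of this kind would typically pair the steering law with a reduced speed setpoint, which keeps the lateral acceleration small and therefore limits how strongly the uncertain friction parameter affects the closed loop.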