Cost-Aware Adaptive Conformal Inference for Runtime Assurance in Dynamic Environments

Bai Xue; Jingduo Pan; Luke Ong; Taoran Wu

arxiv: 2605.24463 · v1 · pith:FDZ62UWYnew · submitted 2026-05-23 · 📡 eess.SY · cs.SY

Pith reviewed 2026-06-30 13:17 UTC · model grok-4.3

keywords: conformal inference, adaptive conformal, runtime assurance, cost-aware, violation cost, dynamic environments, statistical guarantee, control synthesis

The pith

Cost-aware conformal inference bounds both violation frequency and cumulative harm.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Cost-Aware Adaptive Conformal Inference, which folds violation costs into the adaptation rule so that prediction sets widen in proportion to how harmful a miscoverage would be. This produces simultaneous long-run bounds on the average rate of violations and on the total accumulated cost of those violations, even when the underlying data distribution shifts over time. A reader would care because standard conformal methods control only one of those quantities, leaving open the possibility that rare but expensive failures accumulate unacceptable harm. The method is then embedded in a model-free control loop that trades off task performance against these two risk measures.

Core claim

Cost-Aware Adaptive Conformal Inference uses a loss function that multiplies the usual miscoverage indicator by the realized violation cost; the resulting score sequence is fed into the standard adaptive conformal update, yielding a dual guarantee that the long-run fraction of violations stays below a target and the long-run average cost per step stays below a second target, all without knowledge of the time-varying distribution.
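
The update described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an additive update of the form delta <- delta + gamma * (alpha - cost * miss), a sliding-window empirical quantile at level 1 - delta, non-negative scores capped at `max_margin`, and costs bounded away from zero. The names `cost_aware_aci`, `empirical_quantile`, `drifting_stream`, `max_margin`, and `window` are hypothetical.

```python
import random


def empirical_quantile(values, level):
    """Empirical quantile of `values` at `level` in [0, 1] (simple order statistic)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, int(level * len(ordered))))
    return ordered[index]


def cost_aware_aci(scores, costs, alpha, gamma, max_margin, window=100):
    """Run a cost-weighted adaptive conformal update over a score stream.

    Assumed update (illustrative): delta <- delta + gamma * (alpha - cost * miss),
    where miss = 1 if the score exceeds the current threshold, else 0.
    """
    delta = alpha            # adaptive miscoverage level
    history = []             # past scores used for the quantile
    misses, losses = [], []
    for score, cost in zip(scores, costs):
        if delta < 0 or not history:
            threshold = max_margin   # widest set: no miss while scores <= max_margin
        elif delta > 1:
            threshold = -1.0         # empty set: every non-negative score is a miss
        else:
            threshold = empirical_quantile(history[-window:], 1.0 - delta)
        miss = 1 if score > threshold else 0
        loss = cost * miss           # cost-weighted miscoverage
        delta += gamma * (alpha - loss)
        history.append(score)
        misses.append(miss)
        losses.append(loss)
    return misses, losses


def drifting_stream(n, max_margin, seed=0):
    """Synthetic scores whose scale drifts over time, with bounded positive costs."""
    rng = random.Random(seed)
    scores = [min(max_margin, abs(rng.gauss(0.0, 1.0 + 2.0 * t / n))) for t in range(n)]
    costs = [rng.uniform(0.5, 2.0) for _ in range(n)]
    return scores, costs
```

In this sketch only the cost-weighted loss enters the update; the plain miss sequence is recorded so that its frequency can be inspected separately.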

What carries the argument

Cost-aware loss function that multiplies the miscoverage indicator by the violation cost.

If this is right

The controller expands sets more aggressively precisely when violations would be costly and keeps them tight otherwise.
The closed-loop system balances task performance against both reliability and total harm without an explicit plant model.
Prediction-set size automatically reflects severity rather than treating every violation as equal.
The same guarantee holds for any sequence of cost functions provided the costs remain non-negative and bounded.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same weighting idea could be tried inside other adaptive inference schemes that currently track only frequency.
In safety-critical applications the cumulative-cost bound supplies a direct handle on expected total harm over a mission horizon.
One could test whether the dual guarantee remains intact when costs themselves are estimated from data rather than observed exactly.

Load-bearing premise

Weighting the miscoverage indicator by violation costs inside the conformal adaptation rule still produces valid statistical guarantees when the data distribution changes over time.

What would settle it

An experiment in which, under a known non-stationary distribution, either the long-run violation frequency exceeds its target or the cumulative violation cost exceeds its target while the algorithm is running.
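
Such a check could be scripted roughly as follows. This is a hedged sketch: the function names, the `tol` parameter, and the symbols `alpha` and `beta` for the two targets are assumptions made for illustration, and the inputs are taken to be a 0/1 violation sequence and a matching cost sequence logged from a run.

```python
def running_averages(violations, costs):
    """Return [(violation frequency, average violation cost)] after each step."""
    if len(violations) == 0 or len(violations) != len(costs):
        raise ValueError("need two non-empty sequences of equal length")
    total_violations = 0
    total_cost = 0.0
    averages = []
    for n, (violated, cost) in enumerate(zip(violations, costs), start=1):
        total_violations += violated
        total_cost += cost * violated   # cost counts only when a violation occurs
        averages.append((total_violations / n, total_cost / n))
    return averages


def dual_targets_met(violations, costs, alpha, beta, tol=0.0):
    """Check both final averages against their targets, allowing slack `tol`."""
    frequency, average_cost = running_averages(violations, costs)[-1]
    return frequency <= alpha + tol, average_cost <= beta + tol
```

A run in which either returned flag is false over a long horizon, under a known non-stationary distribution, would be the kind of evidence described above.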

Figures

Figures reproduced from arXiv: 2605.24463 by Bai Xue, Jingduo Pan, Luke Ong, Taoran Wu.

**Figure 2.** Comparison of experimental results (lower is better for all three metrics).

**Figure 3.** Panels: Effect of Sensitivity β and W; Effect of Learning Rate γ.

Original abstract

This paper addresses the problem of providing runtime assurance for systems operating online under unknown and potentially time-varying data distributions. We propose Cost-Aware Adaptive Conformal Inference (ACI), a novel framework that incorporates constraint violation costs directly into the conformal adaptation mechanism. Our key insight is that uncertainty margins should adapt not only to the frequency of constraint violations but also to their severity. We formalize this through a cost-aware loss function that couples the miscoverage indicator with violation costs. Unlike existing methods that regulate a single controlled metric, our approach provides a dual statistical guarantee: simultaneously bounding the long-run average violation frequencies (reliability) and cumulative violation cost (harm). By weighting prediction failures according to their severity, the algorithm enables the controller to respond proportionally to violation severity, expanding prediction sets aggressively when necessary while maintaining efficiency during nominal operation. We integrate Cost-Aware ACI into a robust control synthesis framework, creating a closed-loop system that balances task performance with runtime risk control without requiring explicit model knowledge. Experiments validate its effectiveness for online risk-aware controller synthesis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Cost-aware ACI weights the adaptation loss by violation cost, but the dual guarantee on both unweighted frequency and weighted cost does not follow without extra conditions.

The paper's main move is to replace the standard miscoverage loss in adaptive conformal inference with a version that multiplies the indicator by the violation cost. This changes how the threshold updates so that high-severity events pull the margin harder than low-severity ones. The authors then embed the resulting predictor inside a model-free robust control loop.

That weighting step is the concrete novelty. Prior ACI work tracks a single average; here the driving signal is the cost-weighted process. The abstract says this yields simultaneous long-run bounds on violation frequency and on total cost.

The experiments are described only at the abstract level, so their details are not available for assessment. The control integration itself follows the usual pattern of using conformal sets to adjust robustness margins online.

The soft spot is the dual guarantee. Once the adaptation is driven by the weighted loss, the unweighted count is no longer the quantity being regulated. When costs are heterogeneous and time-varying, the threshold can settle at a level that keeps the weighted sum inside its target while the plain frequency exceeds its target. The abstract gives no auxiliary assumption (bounded costs, separate frequency tracking, or similar) that would prevent this drift. Without that, the claim that both bounds hold simultaneously does not go through from the stated construction.
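
A small worked example shows how the two averages can come apart. The numbers are purely illustrative and are not taken from the paper; they assume equal targets α = β = 0.1, a horizon of n = 10 steps, and violations at 5 of those steps, each with cost c_t = 0.1.

```latex
% Illustrative numbers only: n = 10, \alpha = \beta = 0.1,
% I_t = 1 at 5 of the 10 steps, c_t = 0.1 at each violation.
\frac{1}{n}\sum_{t=1}^{n} c_t I_t = \frac{5 \times 0.1}{10} = 0.05 \le \beta,
\qquad
\frac{1}{n}\sum_{t=1}^{n} I_t = \frac{5}{10} = 0.5 > \alpha .
```

The example only shows that the weighted target can be met while the frequency target is missed; whether the algorithm can actually settle into such a regime depends on the paper's full assumptions.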

The work is aimed at people already using conformal methods inside control loops. A reader who wants to see how cost weighting changes the adaptation dynamics could extract the idea, but anyone who needs the dual statistical guarantee should verify the proofs before relying on it.

I would send the paper to review because the cost-weighting idea is worth a technical check even if the guarantee needs tightening.

Referee Report

1 major / 0 minor

Summary. The paper proposes Cost-Aware Adaptive Conformal Inference (ACI), which augments standard ACI by replacing the usual miscoverage loss with a cost-weighted version that couples the indicator of constraint violation with the associated violation cost. The central claim is that this yields a dual long-run statistical guarantee: the time-average violation frequency is bounded by a target α while the time-average violation cost is simultaneously bounded by a target β. The method is then embedded in a robust control synthesis loop for runtime assurance under unknown time-varying distributions, with experiments demonstrating its use for online risk-aware controller design.

Significance. If the dual guarantee can be established, the contribution would be significant for safety-critical control applications. It would extend conformal prediction beyond single-metric coverage to a setting that penalizes high-severity violations more heavily, allowing the prediction sets (and thus the controller) to respond proportionally to harm rather than only to frequency. The closed-loop integration with control synthesis is a natural and potentially useful direction.

major comments (1)

[Abstract] Abstract (and wherever the dual-guarantee theorem appears): the claim that both (1/n)Σ I_t ≤ α and (1/n)Σ c_t I_t ≤ β hold simultaneously is load-bearing for the paper’s contribution. Standard ACI adapts a single threshold via a martingale or quantile-tracking argument that directly drives the unweighted miscoverage process to α. Substituting a cost-weighted loss shifts the adaptation signal to the weighted process. When violation costs are heterogeneous and time-varying, the threshold can converge to a value that meets the weighted target while leaving the unweighted frequency above α. The manuscript must either (i) state auxiliary assumptions (e.g., uniformly bounded costs, separate adaptation loops, or a proven invariance) that restore both bounds or (ii) provide an explicit proof that the weighted adaptation still controls the unweighted rate under the paper’s stated conditions.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful and substantive comment on the dual statistical guarantee, which is indeed central to the contribution. We address the concern directly below and will revise the manuscript accordingly.

Referee: [Abstract] Abstract (and wherever the dual-guarantee theorem appears): the claim that both (1/n)Σ I_t ≤ α and (1/n)Σ c_t I_t ≤ β hold simultaneously is load-bearing for the paper’s contribution. Standard ACI adapts a single threshold via a martingale or quantile-tracking argument that directly drives the unweighted miscoverage process to α. Substituting a cost-weighted loss shifts the adaptation signal to the weighted process. When violation costs are heterogeneous and time-varying, the threshold can converge to a value that meets the weighted target while leaving the unweighted frequency above α. The manuscript must either (i) state auxiliary assumptions (e.g., uniformly bounded costs, separate adaptation loops, or a proven invariance) that restore both bounds or (ii) provide an explicit proof that the weighted adaptation still controls the unweighted rate under the paper’s stated conditions.

Authors: We agree that the original presentation did not make the simultaneous control fully explicit. The Cost-Aware ACI update uses a single threshold driven by the cost-weighted loss, and the manuscript's theorem statement claims both long-run bounds without a self-contained argument showing why the unweighted frequency cannot exceed α when costs vary. In the revision we will supply an explicit proof (option (ii)) under the paper's existing conditions: costs are nonnegative and upper-bounded by a known constant C, the adaptation gain satisfies the standard step-size conditions for almost-sure convergence of the weighted process to β, and the indicator I_t is recovered from the weighted term via the bound 0 ≤ c_t I_t ≤ C I_t. This yields the auxiliary inequality that the unweighted average is controlled by (1/C) times the weighted average plus a vanishing term, thereby establishing both guarantees simultaneously without additional assumptions. The proof will be inserted after the main theorem and the abstract wording will be tightened to reference the new argument.

revision: yes
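
Written out, the bound quoted in the response relates the two averages as shown in the first line below. This is a reading aid, assuming I_t ∈ {0, 1} and 0 ≤ c_t ≤ C as stated in the response; the symbol c_min is hypothetical and does not appear in the source.

```latex
% Assumes I_t \in \{0,1\} and 0 \le c_t \le C, as in the response.
% c_{\min} is a hypothetical symbol introduced here for illustration.
c_t I_t \le C\, I_t
\;\Longrightarrow\;
\frac{1}{n}\sum_{t=1}^{n} c_t I_t \;\le\; C \cdot \frac{1}{n}\sum_{t=1}^{n} I_t ,
\\[4pt]
c_t \ge c_{\min} > 0 \ \text{whenever}\ I_t = 1
\;\Longrightarrow\;
\frac{1}{n}\sum_{t=1}^{n} I_t \;\le\; \frac{1}{c_{\min}} \cdot \frac{1}{n}\sum_{t=1}^{n} c_t I_t .
```

The first line bounds the weighted average by C times the frequency. Bounding the frequency by the weighted average, as in the second line, relies on a positive lower bound on the cost of each violation, which the response as quoted does not state.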

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper introduces a cost-aware loss function within the ACI adaptation mechanism and claims that this yields simultaneous long-run bounds on both unweighted violation frequency and cost-weighted harm. No equations, self-citations, or uniqueness theorems are exhibited in the abstract or description that would reduce either guarantee to a fitted parameter or prior result by construction. The adaptation is presented as a direct formal extension of standard ACI, with the dual property asserted to follow from the weighted loss without evident self-referential closure or renaming of known empirical patterns. The derivation chain therefore remains self-contained against external statistical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only; no explicit free parameters, invented entities, or ad-hoc axioms listed. Relies on background assumptions of conformal prediction.

axioms (1)

standard math: Conformal prediction provides valid coverage guarantees under suitable assumptions on data exchangeability or stationarity. This is the implicit foundation for any conformal inference method.

pith-pipeline@v0.9.1-grok · 5716 in / 1016 out tokens · 29556 ms · 2026-06-30T13:17:00.909470+00:00 · methodology


[1] [1]

Safe reinforcement learning via shielding

Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, and Ufuk Topcu. Safe reinforcement learning via shielding. InProceedings of the AAAI conference on artificial intelligence, volume 32, 2018

2018

[2] [2]

Control barrier functions: Theory and applications

Aaron D Ames, Samuel Coogan, Magnus Egerstedt, Gennaro Notomista, Koushil Sreenath, and Paulo Tabuada. Control barrier functions: Theory and applications. In2019 18th European control conference (ECC), pages 3420–3431. Ieee, 2019

2019

[3] [3]

Control barrier function based quadratic programs for safety critical systems.IEEE Transactions on Automatic Control, 62(8):3861–3876, 2016

Aaron D Ames, Xiangru Xu, Jessy W Grizzle, and Paulo Tabuada. Control barrier function based quadratic programs for safety critical systems.IEEE Transactions on Automatic Control, 62(8):3861–3876, 2016

2016

[4] [4]

Casadi—a soft- ware framework for nonlinear optimization and optimal control.Mathematical Programming Computation, 11(1):1–36, 2018

Joel Andersson, Joris Gillis, Greg Horn, Jim Rawlings, and Moritz Diehl. Casadi—a soft- ware framework for nonlinear optimization and optimal control.Mathematical Programming Computation, 11(1):1–36, 2018

2018

[5] [5]

Conformal pid control for time series prediction.Advances in neural information processing systems, 36:23047– 23074, 2023

Anastasios Angelopoulos, Emmanuel Candes, and Ryan J Tibshirani. Conformal pid control for time series prediction.Advances in neural information processing systems, 36:23047– 23074, 2023

2023

[6] [6]

Conformal prediction beyond exchangeability.The Annals of Statistics, 51(2):816–845, 2023

Rina Foygel Barber, Emmanuel J Candes, Aaditya Ramdas, and Ryan J Tibshirani. Conformal prediction beyond exchangeability.The Annals of Statistics, 51(2):816–845, 2023

2023

[7] [7]

Robust adaptive control of feedback linearizable mimo nonlinear systems with prescribed performance.IEEE transactions on Au- tomatic Control, 53(9):2090–2099, 2008

Charalampos P Bechlioulis and George A Rovithakis. Robust adaptive control of feedback linearizable mimo nonlinear systems with prescribed performance.IEEE transactions on Au- tomatic Control, 53(9):2090–2099, 2008

2090

[8] [8]

Conformal quantitative predictive monitoring of stl requirements for stochastic processes

Francesca Cairoli, Nicola Paoletti, Luca Bortolussi, et al. Conformal quantitative predictive monitoring of stl requirements for stochastic processes. InHSCC’23: Proceedings of the 26th 10 ACM International Conference on Hybrid Systems: Computation and Control, volume 1, pages 1–11. ACM, 2023

2023

[9] [9]

Guaranteeing safety of learned perception modules via measurement-robust control barrier functions

Sarah Dean, Andrew Taylor, Ryan Cosner, Benjamin Recht, and Aaron Ames. Guaranteeing safety of learned perception modules via measurement-robust control barrier functions. In Conference on Robot Learning, pages 654–670. PMLR, 2021

2021

[10] Anushri Dixit, Lars Lindemann, Skylar X Wei, Matthew Cleaveland, George J Pappas, and Joel W Burdick. Adaptive conformal prediction for motion planning among dynamic agents. In Learning for Dynamics and Control Conference, pages 300–314. PMLR, 2023.

[11] Samira S Farahani, Rupak Majumdar, Vinayak S Prabhu, and Sadegh Soudjani. Shrinking horizon model predictive control with signal temporal logic constraints under stochastic disturbances. IEEE Transactions on Automatic Control, 64(8):3324–3331, 2018.

[12] Shai Feldman, Liran Ringel, Stephen Bates, and Yaniv Romano. Achieving risk control in online learning settings. Transactions on Machine Learning Research, 2024.

[13] Carlos E Garcia, David M Prett, and Manfred Morari. Model predictive control: Theory and practice—a survey. Automatica, 25(3):335–348, 1989.

[14] Isaac Gibbs and Emmanuel Candes. Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems, 34:1660–1672, 2021.

[15] Isaac Gibbs and Emmanuel J Candès. Conformal inference for online prediction with arbitrary distribution shifts. Journal of Machine Learning Research, 25(162):1–36, 2024.

[16] Didier Henrion and Milan Korda. Convex computation of the region of attraction of polynomial control systems. IEEE Transactions on Automatic Control, 59(2):297–312, 2013.

[17] Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pastor, and Sergey Levine. How to train your robot with deep reinforcement learning: lessons we have learned. The International Journal of Robotics Research, 40(4-5):698–721, 2021.

[18] Jordan Lekeufack, Anastasios N Angelopoulos, Andrea Bajcsy, Michael I Jordan, and Jitendra Malik. Conformal decision theory: Safe autonomous decisions from imperfect predictions. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11668–11675. IEEE, 2024.

[19] Lars Lindemann, Matthew Cleaveland, Gihyun Shim, and George J Pappas. Safe planning in dynamic environments using conformal prediction. IEEE Robotics and Automation Letters, 8(8):5116–5123, 2023.

[20] Lars Lindemann and Dimos V Dimarogonas. Control barrier functions for signal temporal logic tasks. IEEE Control Systems Letters, 3(1):96–101, 2018.

[21] Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, and Nikolai Matni. Learning robust output control barrier functions from safe expert demonstrations. IEEE Open Journal of Control Systems, 3:158–172, 2024.

[22] Lars Lindemann, Yiqi Zhao, Xinyi Yu, George J Pappas, and Jyotirmoy V Deshmukh. Formal verification and control with conformal prediction: Practical safety guarantees for autonomous systems. IEEE Control Systems, 45(6):72–122, 2025.

[23] Lennart Ljung and Torsten Söderström. Theory and practice of recursive identification. 1983.

[24] Edward N Lorenz. Predictability: A problem partly solved. In Proc. Seminar on predictability, volume 1, pages 1–18. Reading, 1996.

[25] Manfred Morari and Jay H Lee. Model predictive control: past, present and future. Computers & Chemical Engineering, 23(4-5):667–682, 1999.

[26] Aleksandr Podkopaev, Darren Xu, and Kuang-chih Lee. Adaptive conformal inference by betting. In Proceedings of the 41st International Conference on Machine Learning, pages 40886–40907, 2024.

[27] Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V Dimarogonas, Stephen Tu, and Nikolai Matni. Learning control barrier functions from expert demonstrations. In 2020 59th IEEE Conference on Decision and Control (CDC), pages 3717–3724. IEEE, 2020.

[28] Pierre OM Scokaert, David Q Mayne, and James B Rawlings. Suboptimal model predictive control (feasibility implies stability). IEEE Transactions on Automatic Control, 44(3):648–654, 2002.

[29] Glenn Shafer and Vladimir Vovk. A tutorial on conformal prediction. Journal of Machine Learning Research, 9(3), 2008.

[30] Shili Sheng, Pian Yu, David Parker, Marta Kwiatkowska, and Lu Feng. Safe POMDP online planning among dynamic agents via adaptive conformal prediction. IEEE Robotics and Automation Letters, 2024.

[31] Martim Sousa, Ana Maria Tomé, and José Moreira. A general framework for multi-step ahead adaptive conformal heteroscedastic time series forecasting. Neurocomputing, 608:128434, 2024.

[32] Mohit Srinivasan, Amogh Dabholkar, Samuel Coogan, and Patricio A Vela. Synthesis of control barrier functions using a supervised machine learning approach. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 7139–7145. IEEE, 2020.

[33] Kegan J Strawn, Nora Ayanian, and Lars Lindemann. Conformal predictive safety filter for RL controllers in dynamic environments. IEEE Robotics and Automation Letters, 8(11):7833–7840, 2023.

[34] Johan Hallberg Szabadváry. Adaptive conformal inference for multi-step ahead time-series forecasting online. arXiv preprint arXiv:2409.14792, 2024.

[35] Andrew Taylor, Andrew Singletary, Yisong Yue, and Aaron Ames. Learning for safety-critical control with control barrier functions. In Learning for Dynamics and Control, pages 708–717. PMLR, 2020.

[36] Brijen Thananjeyan, Ashwin Balakrishna, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E Gonzalez, Julian Ibarz, Chelsea Finn, and Ken Goldberg. Recovery RL: Safe reinforcement learning with learned recovery zones. IEEE Robotics and Automation Letters, 6(3):4915–4922, 2021.

[37] Ryan J Tibshirani, Rina Foygel Barber, Emmanuel Candes, and Aaditya Ramdas. Conformal prediction under covariate shift. Advances in Neural Information Processing Systems, 32, 2019.

[38] Faraz Torabi, Garrett Warnell, and Peter Stone. Behavioral cloning from observation. arXiv preprint arXiv:1805.01954, 2018.

[39] Mark Towers, Ariel Kwiatkowski, Jordan Terry, John U Balis, Gianluca De Cola, Tristan Deleu, Manuel Goulão, Andreas Kallinteris, Markus Krimmel, Arjun KG, et al. Gymnasium: A standard interface for reinforcement learning environments. arXiv preprint arXiv:2407.17032, 2024.

[40] Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic learning in a random world. Springer, 2005.

[41] Andreas Wächter and Lorenz T Biegler. On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming. Mathematical Programming, 106(1):25–57, 2006.

[42] Zitong Yang, Emmanuel Candès, and Lihua Lei. Bellman conformal inference: Calibrating prediction intervals for time series. arXiv preprint arXiv:2402.05203, 2024.

[43] Jianpeng Yao, Xiaopan Zhang, Yu Xia, Zejin Wang, Amit K Roy-Chowdhury, and Jiachen Li. Sonic: Safe social navigation with adaptive conformal inference and constrained reinforcement learning. arXiv preprint arXiv:2407.17460, 2024.

[44] Margaux Zaffran, Olivier Féron, Yannig Goude, Julie Josse, and Aymeric Dieuleveut. Adaptive conformal predictions for time series. In International Conference on Machine Learning, pages 25834–25866. PMLR, 2022.

[45] Hao Zhou, Yanze Zhang, and Wenhao Luo. Safety-critical control with uncertainty quantification using adaptive conformal prediction. In 2024 American Control Conference (ACC), pages 574–580. IEEE, 2024.

[46] Martin Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning (ICML-03), pages 928–936, 2003.

Appendix A Proof

A.1 Proof of Lemma 1

Lemma 1 (Parameter Boundedness). Let $\{\delta_k\}$ be generated by the update rule (7). Under Assumptions 3 and 4, if $\delta_1$ is initiali...

If $\delta_{k_0-1} < 0$, then by definition $\hat{Q}_{k_0-1}(\delta_{k_0-1}) = M$. By Assumption 4, the controller enforces $\widehat{h}(x_{k_0-2}, u_{k_0-2}) \ge M$, which implies $s_{k_0-1} \le M = \hat{Q}_{k_0-1}$, leading to $e_{k_0-1} = 0$ and $L_{k_0-1} = 0$. This contradicts $L_{k_0-1} > \alpha$.

If $\delta_{k_0-1} \ge 0$, the minimum possible $\delta_{k_0}$ is $0 + \gamma(\alpha - L_{\max}) = -\gamma(L_{\max} - \alpha)$, establishing the contradiction.

Upper bound: The argument follows symmetrically. If $\delta_{k_0} > 1 + \gamma\alpha$, consider the minimal such $k_0$. For $\delta_{k_0-1} > 1$, we have $\hat{Q}_{k_0-1} = -\epsilon$ (since $1 - \delta_{k_0-1} < 0$), forcing $e_{k_0} = 1$ and $L_{k_0} > 1 > \alpha$, which decreases $\delta_{k_0}$, a contradiction. For $\delta_{k_0-1} \le 1$, the maximum possible $\delta_{k_0}$ is 1 +...
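The boundedness argument can be checked numerically with a small simulation. The sketch below is illustrative only and rests on several assumptions not confirmed by this excerpt: the update rule (7) is taken to have the form $\delta_{k+1} = \delta_k + \gamma(\alpha - L_k)$ (inferred from the step $0 + \gamma(\alpha - L_{\max})$ above), the loss is taken to be $L_k = e_k \cdot c_k$ with a hypothetical cost $c_k \in (1, L_{\max}]$, scores are drawn uniformly from $[0, M]$, and the threshold inside $[0, 1]$ is a plain empirical quantile. The names `quantile_threshold` and `simulate` are hypothetical.

```python
import random


def quantile_threshold(delta, calib_scores, M, eps):
    """Threshold Q_hat(delta), clipped as in the proof:
    M when delta < 0, -eps when delta > 1, else an empirical quantile."""
    if delta < 0:
        return M
    if delta > 1:
        return -eps
    ordered = sorted(calib_scores)
    # Assumed form: empirical (1 - delta)-quantile of the calibration scores.
    idx = min(len(ordered) - 1, int((1 - delta) * len(ordered)))
    return ordered[idx]


def simulate(alpha=0.1, gamma=0.05, M=1.0, eps=1e-3, L_max=5.0,
             steps=5000, seed=0):
    """Run the assumed update delta <- delta + gamma * (alpha - L) and
    return the full trajectory of delta."""
    rng = random.Random(seed)
    calib = [rng.uniform(0.0, M) for _ in range(200)]
    delta = alpha  # initial value inside [0, 1]
    trace = [delta]
    for _ in range(steps):
        q = quantile_threshold(delta, calib, M, eps)
        s = rng.uniform(0.0, M)                  # score, always in [0, M]
        e = 1 if s > q else 0                    # miscoverage indicator
        cost = rng.uniform(1.0 + 1e-9, L_max)    # hypothetical violation cost
        loss = e * cost                          # cost-aware loss, in [0, L_max]
        delta = delta + gamma * (alpha - loss)
        trace.append(delta)
    return trace


if __name__ == "__main__":
    trace = simulate()
    lower = -0.05 * (5.0 - 0.1)   # -gamma * (L_max - alpha)
    upper = 1 + 0.05 * 0.1        # 1 + gamma * alpha
    print(min(trace) >= lower, max(trace) <= upper)
```

Under these assumptions the trajectory should stay inside $[-\gamma(L_{\max} - \alpha),\ 1 + \gamma\alpha]$, mirroring the two cases of the proof: a negative $\delta$ forces zero loss and pushes $\delta$ up, while $\delta > 1$ forces a violation with loss above $\alpha$ and pushes $\delta$ down.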
