pith. machine review for the scientific record.

arXiv:2605.13028 · v1 · submitted 2026-05-13 · 💻 cs.RO · cs.SY · eess.SY


Local Conformal Calibration of Dynamics Uncertainty from Semantic Images

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 19:13 UTC · model grok-4.3

classification 💻 cs.RO · cs.SY · eess.SY
keywords conformal prediction · uncertainty quantification · robot dynamics · semantic images · calibration · safe planning · epistemic uncertainty · aleatoric uncertainty

The pith

OCULAR uses semantic images to calibrate any linear Gaussian dynamics model and guarantee that future states fall inside user-chosen probability regions in unseen environments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OCULAR, which groups calibration data from environments whose semantic images look alike and uses that grouping to adjust uncertainty bounds on a given linear Gaussian dynamics model. The resulting prediction regions are guaranteed to contain the true next state with probability at least 1 minus a user-chosen value, even when the model is imperfect and the test environment differs from the calibration data. A sympathetic reader would care because the guarantees are non-asymptotic and distribution-free, so they apply directly to robot planning without requiring exact environment matches or strong assumptions on the real dynamics. The method also flags which observation-velocity-action triples produce larger uncertainty, supporting safer action selection.

Core claim

OCULAR performs local conformal calibration by matching semantic images to select relevant calibration trajectories, then produces prediction sets around the nominal next-state mean whose size accounts for both process noise and model mismatch; these sets contain the true future state with probability at least 1-epsilon for any user epsilon, regardless of the fidelity of the supplied linear Gaussian model.

What carries the argument

Local conformal calibration that selects calibration samples via semantic-image similarity to adjust next-state prediction intervals.
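In miniature, that mechanism looks like the sketch below: select the calibration samples whose learned representations are most similar to the test input, then take a finite-sample conformal quantile of their nonconformity scores as the interval half-width. The k-nearest-neighbor selection rule and all names here are illustrative assumptions, not the paper's actual procedure (which partitions the input space with a decision tree):

```python
import numpy as np

def local_conformal_interval(cal_features, cal_residuals, test_feature,
                             epsilon=0.1, k=50):
    """Hypothetical sketch of local conformal calibration.

    cal_features : (n, d) array of calibration inputs, standing in for the
        learned semantic-image representations.
    cal_residuals : (n,) nonconformity scores, e.g. distance between the true
        next state and the nominal model's predicted mean.
    Returns a half-width q so that [mean - q, mean + q] covers the true next
    state with probability >= 1 - epsilon, under exchangeability of the
    selected calibration points and the test point.
    """
    # Nearest neighbors in representation space stand in for
    # semantic-image similarity.
    dists = np.linalg.norm(cal_features - test_feature, axis=1)
    local = cal_residuals[np.argsort(dists)[:k]]
    n = len(local)
    # Finite-sample conformal rank: ceil((n+1)(1-epsilon)), clipped to n.
    rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
    return np.sort(local)[rank - 1]
```

The finite-sample rank (rather than the plain empirical quantile) is what makes the guarantee non-asymptotic; with too few local samples the clipped rank returns the maximum score and the interval becomes conservatively wide.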

If this is right

  • The calibrated uncertainty can be used inside a planner to avoid actions whose observation-velocity-action inputs produce large prediction regions.
  • Guarantees remain valid for out-of-distribution test scenes provided the scenes share semantic appearance with the calibration data.
  • The procedure works for dynamics models of arbitrary fidelity, from coarse to highly accurate linear Gaussian approximations.
  • It produces smaller average prediction volumes than methods that must collect data in the exact target environment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Perception similarity can act as a practical proxy for dynamics similarity when collecting calibration data for new settings.
  • The same grouping idea could be tested on real robots to check whether camera-based calibration transfers across physical hardware changes.
  • The distinction between high- and low-uncertainty inputs might be combined with reachability analysis to produce explicit safety certificates.

Load-bearing premise

Data from environments whose semantic images are similar can be used to calibrate the linear Gaussian model so that the resulting bounds remain valid in any new test environment that is also visually similar.

What would settle it

Apply OCULAR to a test environment whose camera images match the calibration set but whose true next-state distribution lies outside the computed prediction regions more often than the target probability.
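That falsification test amounts to a coverage audit: roll out in the visually matched environment and check whether the empirical hit rate of the prediction regions drops below the target. A minimal 1-D sketch, with synthetic numbers rather than the paper's data, and a fixed half-width standing in for the calibrated one:

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_coverage(true_next_states, means, half_widths):
    """Fraction of true next states inside the symmetric prediction
    intervals [mean - q, mean + q] (1-D sketch)."""
    inside = np.abs(true_next_states - means) <= half_widths
    return inside.mean()

# Illustrative check: Gaussian disturbances around the nominal mean.
means = np.zeros(1000)
truths = rng.normal(0.0, 1.0, size=1000)
q = 1.96  # would come from the conformal calibration step
rate = empirical_coverage(truths, means, q)
# A falsifying experiment would show rate < 1 - epsilon despite visual
# similarity; this well-specified Gaussian example sits near 0.95.
```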

Figures

Figures reproduced from arXiv:2605.13028 by Dmitry Berenson, Luís Marques.

Figure 1
Figure 1. Offline component of OCULAR. 1: observations o_t from D_cal^part are projected into a planar footprint o'_t by φ. A CAE is trained to reconstruct o'_t, and the decoder is discarded. 2: All data in D_cal is processed by process() into a learned representation X_i, and nonconformity scores R_i are computed. 3: a Decision Tree is trained on D_cal^part to partition the learned input space X into regions of approximately … view at source ↗
Figure 2
Figure 2. Online component of OCULAR. Given an estimated Gaussian at time t, a desired action a_t, and observation o_t, we create an approximate next-step Gaussian Ñ_{t+1} via the approximate model f̃. The current-time information X_i^raw is processed, and the learned representation X_i passed to the Decision Tree. The resulting leaf node X_k has an associated q̂_k which is multiplied by a fixed constant to get ξ_k. The app… view at source ↗
Figure 3
Figure 3. Rollouts in map U of the planar robot (see App. D.1 for all maps). … [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗
Figure 4
Figure 4. Rollouts in icyMain of the Isaac Sim experiment (see App. D.2 for other …) [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗
Figure 5
Figure 5. Trajectories of OCULAR and baselines in the planar environment, across 30 runs for each map-method combo. The uncalibrated baseline gains too much momentum in the low-friction region, leading to collisions. SplitCP is too conservative, with all its sampled plans being in collision. This results in goal-chasing behavior and an inability to avoid obstacles. LUCCa and OCULAR have comparable performance, slow… view at source ↗
Figure 6
Figure 6. The three tested Isaac Sim environments (based on Rivermark). [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗
Figure 7
Figure 7. Example observation and observed map in the Isaac Sim environment. [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗
Figure 8
Figure 8. Comparison of all methods in the Isaac environment. [PITH_FULL_IMAGE:figures/full_fig_p026_8.png] view at source ↗
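The offline/online split described in Figures 1-2 can be caricatured as follows. The 1-D binning stands in for the paper's decision-tree leaves, and `fit_region_quantiles` / `online_region_width` are hypothetical names, not the paper's API; the point is only the shape of the pipeline: per-region quantiles q̂_k computed offline, scaled to a width ξ_k at query time.

```python
import numpy as np

def fit_region_quantiles(reps, scores, edges, epsilon=0.1):
    """Offline step (sketch): partition the representation space into
    regions (1-D bins standing in for decision-tree leaves) and store a
    finite-sample conformal quantile q_k of the scores per region."""
    q = {}
    for k in range(len(edges) - 1):
        in_k = scores[(reps >= edges[k]) & (reps < edges[k + 1])]
        n = len(in_k)
        if n == 0:
            q[k] = np.inf  # no data: region stays maximally uncertain
            continue
        rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
        q[k] = np.sort(in_k)[rank - 1]
    return q

def online_region_width(rep, edges, q, scale=1.0):
    """Online step (sketch): map the current representation to its region
    and return the scaled half-width xi_k = scale * q_k."""
    k = int(np.clip(np.searchsorted(edges, rep, side="right") - 1,
                    0, len(q) - 1))
    return scale * q[k]
```

Regions whose calibration scores are large yield wide online regions, which is exactly the observation-velocity-action discrimination the review credits to the method.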
read the original abstract

We introduce Observation-aware Conformal Uncertainty Local-Calibration (OCULAR), a conformal prediction-based algorithm that uses perception information to provide uncertainty quantification guarantees for unseen test-time environments. While previous conformal approaches lack the ability to discriminate between state-action space regions leading to higher or lower model mismatch, and require environment-specific data, our method uses data collected from visually similar environments to provably calibrate a given linear Gaussian dynamics model of arbitrary fidelity. The prediction regions generated from OCULAR are guaranteed to contain the future system states with, at least, a user-set likelihood, despite both aleatoric and epistemic uncertainty -- i.e., uncertainty arising from both stochastic disturbances and lack of data. Our guarantees are non-asymptotic and distribution-free, not requiring strong assumptions about the unknown real system dynamics. Our calibration procedure enables distinguishing between observation-velocity-action inputs leading to higher and lower next-state-uncertainty, which is helpful for probabilistically-safe planning. We numerically validate our algorithm on a double-integrator system subject to random perturbations and significant model mismatch, using both a simplified sensor and a more realistic simulated camera. Our approach appropriately quantifies uncertainty both when in-distribution and out-of-distribution, being comparatively volume-efficient to baselines requiring environment-specific data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces OCULAR, a conformal-prediction algorithm that uses semantic images to identify visually similar environments and locally calibrate a linear-Gaussian dynamics model of arbitrary fidelity. It claims non-asymptotic, distribution-free marginal coverage guarantees for future states under both aleatoric and epistemic uncertainty, without requiring environment-specific calibration data, and demonstrates the approach on a double-integrator subject to perturbations and model mismatch using both simplified and camera-based sensors.

Significance. If the coverage transfer via semantic similarity can be rigorously established, the result would enable practical uncertainty quantification for robotics in novel environments where collecting matched calibration data is costly or impossible, while also providing spatially varying uncertainty estimates useful for probabilistically safe planning.

major comments (2)
  1. [§3 (theoretical guarantees)] The central coverage claim rests on exchangeability between the semantically selected calibration set and the test points. The manuscript does not derive that perceptual similarity (via semantic images) implies the required exchangeability or bounds the total-variation distance between the induced residual distributions; without this step the non-asymptotic guarantee does not transfer to unseen environments.
  2. [§5 (numerical experiments)] In the numerical validation on the double-integrator, the reported coverage is shown for both in-distribution and out-of-distribution cases, yet the paper does not quantify how the semantic-similarity threshold affects the empirical coverage gap or the volume of the resulting prediction sets; this leaves open whether the method remains volume-efficient when the visual similarity metric is imperfect.
minor comments (2)
  1. [§2–3] Notation for the conformal score and the local calibration set should be introduced with a single consistent symbol table rather than being redefined inline in multiple sections.
  2. [§5] Figure captions for the camera-based sensor experiments should explicitly state the semantic similarity metric and the number of calibration environments used.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. The feedback highlights opportunities to strengthen the presentation of the theoretical assumptions and to expand the experimental sensitivity analysis. We address each major comment below and indicate the planned revisions.

read point-by-point responses
  1. Referee: [§3 (theoretical guarantees)] The central coverage claim rests on exchangeability between the semantically selected calibration set and the test points. The manuscript does not derive that perceptual similarity (via semantic images) implies the required exchangeability or bounds the total-variation distance between the induced residual distributions; without this step the non-asymptotic guarantee does not transfer to unseen environments.

    Authors: We agree that an explicit statement of the exchangeability assumption is needed for clarity. In the revised manuscript we will insert a dedicated paragraph in §3 that states the coverage guarantee holds conditionally on the semantic similarity selection, under the modeling assumption that points sharing the same semantic class are exchangeable. This is the standard localization assumption in conformal methods and preserves the distribution-free, non-asymptotic character of the result. We deliberately avoid total-variation bounds because they would require parametric assumptions on the residual distributions, contradicting the paper’s goal of distribution-free guarantees. The selection procedure itself is deterministic given the images, so the marginal coverage statement remains valid over the joint distribution of calibration and test points that pass the similarity filter. revision: partial

  2. Referee: [§5 (numerical experiments)] In the numerical validation on the double-integrator, the reported coverage is shown for both in-distribution and out-of-distribution cases, yet the paper does not quantify how the semantic-similarity threshold affects the empirical coverage gap or the volume of the resulting prediction sets; this leaves open whether the method remains volume-efficient when the visual similarity metric is imperfect.

    Authors: We thank the referee for this observation. In the revised §5 we will add two new figures that sweep the similarity threshold over a range of values and plot (i) the empirical coverage gap relative to the nominal level and (ii) the average volume of the prediction sets, separately for the simplified-sensor and camera-based experiments. These plots will include both in-distribution and out-of-distribution test conditions and will demonstrate that coverage remains within a small additive gap of the target while prediction-set volume grows gracefully as the threshold is relaxed, confirming volume efficiency even under imperfect similarity. revision: yes
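The proposed sensitivity study is straightforward to script. A sketch under assumed names and synthetic inputs (none of this is the authors' code): for each similarity threshold, keep only calibration points at least that similar, recompute the conformal quantile, and report the coverage gap and interval width.

```python
import numpy as np

def sweep_similarity_threshold(sims, residuals, test_residuals, thresholds,
                               epsilon=0.1):
    """For each threshold tau, calibrate on points with similarity >= tau
    and report (tau, empirical coverage gap, interval half-width)."""
    out = []
    for tau in thresholds:
        kept = residuals[sims >= tau]
        n = len(kept)
        if n == 0:
            out.append((tau, np.nan, np.inf))  # nothing passes the filter
            continue
        rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
        q = np.sort(kept)[rank - 1]
        cov = np.mean(test_residuals <= q)
        out.append((tau, cov - (1 - epsilon), q))
    return out
```

Plotting gap and width against tau would show directly whether coverage degrades or the sets merely inflate as the similarity filter is relaxed, which is the referee's open question.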

Circularity Check

0 steps flagged

OCULAR applies standard conformal prediction to semantically similar environments; guarantees derive from exchangeability without reduction to fitted inputs or self-citations

full rationale

The derivation relies on established conformal prediction results for non-asymptotic distribution-free coverage under exchangeability, applied to calibration data selected by semantic similarity. No equations define the prediction regions in terms of the target test quantities themselves, nor do any load-bearing steps reduce to parameters fitted from the same data or to self-citations whose validity depends on the present work. The linear-Gaussian model calibration and local uncertainty distinction follow directly from the conformal quantile construction on the selected sets, preserving independence from the test distribution. This is self-contained against external benchmarks of conformal prediction theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Relies on standard conformal prediction exchangeability assumptions and the domain assumption that visual similarity in semantic images correlates with dynamics similarity sufficient for calibration.

axioms (2)
  • standard math Conformal prediction provides valid coverage under exchangeability of calibration and test points
    Core property invoked for the non-asymptotic guarantees.
  • domain assumption Visually similar environments provide data that can calibrate the linear Gaussian model for the target environment
    Central premise allowing use of perception data instead of environment-specific collection.
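The first axiom is checkable numerically: with i.i.d. (hence exchangeable) scores, the conformal quantile covers a fresh point at close to the nominal rate. A minimal self-contained check, unrelated to the paper's data:

```python
import numpy as np

rng = np.random.default_rng(2)

# With n exchangeable calibration scores and rank ceil((n+1)(1-eps)),
# a fresh point is covered with probability rank/(n+1) = 0.90 here.
epsilon, n, trials = 0.1, 99, 2000
hits = 0
for _ in range(trials):
    scores = rng.exponential(size=n)
    rank = int(np.ceil((n + 1) * (1 - epsilon)))  # = 90
    q = np.sort(scores)[rank - 1]
    hits += rng.exponential() <= q  # fresh draw from the same distribution
coverage = hits / trials
# coverage concentrates near 0.90 regardless of the score distribution
```

The second axiom is the one the referee targets: no amount of simulation of exchangeable data can establish that visual similarity induces exchangeability between calibration and test environments.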

pith-pipeline@v0.9.0 · 5515 in / 1258 out tokens · 44173 ms · 2026-05-14T19:13:22.012873+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

  1. Agrawal, D.R., Panagou, D.: Safe control synthesis via input constrained control barrier functions. In: CDC (2021)
  2. Alsterda, J.P., Brown, M., Gerdes, J.C.: Contingency model predictive control for automated vehicles. In: ACC, pp. 717–722 (2019). https://doi.org/10.23919/ACC.2019.8815260
  3. Ames, A.D., Coogan, S., Egerstedt, M., Notomista, G., Sreenath, K., Tabuada, P.: Control barrier functions: Theory and applications. In: ECC, pp. 3420–3431. IEEE (2019)
  4. Angelopoulos, A.N., Barber, R.F., Bates, S.: Theoretical foundations of conformal prediction (2025). https://arxiv.org/abs/2411.11824
  5. Bansal, S., Chen, M., Herbert, S., Tomlin, C.J.: Hamilton–Jacobi reachability: A brief overview and recent advances. In: CDC, pp. 2242–2253. IEEE (2017)
  6. Blackmore, L., Ono, M., Williams, B.C.: Chance-constrained optimal path planning with obstacles. IEEE Transactions on Robotics 27(6), 1080–1094 (2011). https://doi.org/10.1109/TRO.2011.2161160
  7. Breiman, L.: Classification and Regression Trees. Routledge (2017)
  8. Cabezas, L.M., Otto, M.P., Izbicki, R., Stern, R.B.: Regression trees for fast and adaptive prediction intervals. Information Sciences 686, 121369 (2025)
  9. Cosner, R.K., Culbertson, P., Taylor, A.J., Ames, A.D.: Robust safety under stochastic uncertainty with discrete-time control barrier functions. In: RSS (2023)
  10. Dixit, A., Lindemann, L., Wei, S.X., Cleaveland, M., Pappas, G.J., Burdick, J.W.: Adaptive conformal prediction for motion planning among dynamic agents. In: Proceedings of The 5th Annual Learning for Dynamics and Control Conference, Proceedings of Machine Learning Research, vol. 211, pp. 300–314. PMLR
  11. Gibbs, I., Candes, E.: Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems 34, 1660–1672 (2021)
  12. Hartley, R.: Multiple View Geometry in Computer Vision, vol. 665. Cambridge University Press (2003)
  13. Khan, M., Ibuki, T., Chatterjee, A.: Safety uncertainty in control barrier functions using Gaussian processes. In: ICRA (2021)
  14. Kuchibhotla, A.K.: Exchangeability, conformal prediction, and rank tests. arXiv preprint arXiv:2005.06095 (2020)
  15. LaValle, S.M.: Planning Algorithms. Cambridge University Press (2006)
  16. Lei, J., Wasserman, L.: Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society Series B: Statistical Methodology 76(1), 71–96 (2014)
  17. Lindemann, L., Cleaveland, M., Shim, G., Pappas, G.J.: Safe planning in dynamic environments using conformal prediction. IEEE Robotics and Automation Letters 8(8), 5116–5123 (2023)
  18. Luo, R., Zhao, S., Kuck, J., Ivanovic, B., Savarese, S., Schmerling, E., Pavone, M.: Sample-efficient safety assurances using conformal prediction. IJRR (2024)
  19. Marques, L., Berenson, D.: Quantifying aleatoric and epistemic dynamics uncertainty via local conformal calibration. In: Algorithmic Foundations of Robotics XVI, vol. 2, pp. 85–103. Springer Nature Switzerland, Cham (2026)
  20. Marques, L., Ghaffari, M., Berenson, D.: Lies we can trust: Quantifying action uncertainty with inaccurate stochastic dynamics through conformalized non-holonomic Lie groups. IEEE Robotics and Automation Letters, pp. 1–8 (2026). https://doi.org/10.1109/LRA.2026.3656773
  21. Mei, Z., Dixit, A., Booker, M., Zhou, E., Storey-Matsutani, M., Ren, A.Z., Shorinwa, O., Majumdar, A.: Perceive with confidence: Statistical safety assurances for navigation with learning-based perception. The International Journal of Robotics Research, 02783649251378151 (2024)
  22. Oliveira, M., Santos, V., Sappa, A.D.: Multimodal inverse perspective mapping. Information Fusion 24, 108–121 (2015)
  23. Ren, T., Liu, S., Zeng, A., Lin, J., Li, K., Cao, H., Chen, J., Huang, X., Chen, Y., Yan, F., et al.: Grounded SAM: Assembling open-world models for diverse visual tasks. arXiv preprint arXiv:2401.14159 (2024)
  24. Shi, G., Shi, X., O'Connell, M., Yu, R., Azizzadenesheli, K., Anandkumar, A., Yue, Y., Chung, S.J.: Neural Lander: Stable drone landing control using learned dynamics. In: ICRA (2019)
  25. Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., et al.: DINOv3. arXiv preprint arXiv:2508.10104 (2025)
  26. Stutts, A.C., Erricolo, D., Tulabandhula, T., Trivedi, A.R.: Lightweight, uncertainty-aware conformalized visual odometry. In: IROS, pp. 7742–7749 (2023). https://doi.org/10.1109/IROS55552.2023.10341924
  27. Singh, S., Pavone, M., Slotine, J.J.: Tube-based MPC: a contraction theory approach. In: IEEE CDC (2016)
  28. Sun, J., Jiang, Y., Qiu, J., Nobel, P., Kochenderfer, M.J., Schwager, M.: Conformal prediction for uncertainty-aware planning with diffusion dynamics model. NeurIPS 36, 80324–80337 (2023)
  29. Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer (2005)
  30. Wang, H., Borquez, J., Bansal, S.: Providing safety assurances for systems with unknown dynamics. IEEE Control Systems Letters (2024)
  31. Williams, G., Wagener, N., Goldfain, B., Drews, P., Rehg, J.M., Boots, B., Theodorou, E.A.: Information theoretic MPC for model-based reinforcement learning. In: ICRA, pp. 1714–1721. IEEE (2017)
  32. Yang, S., Pappas, G.J., Mangharam, R., Lindemann, L.: Safe perception-based control under stochastic sensor uncertainty using conformal prediction. In: CDC, pp. 6072–6078 (2023). https://doi.org/10.1109/CDC49753.2023.10384075
  33. Zhou, H., Zhang, Y., Luo, W.: Safety-critical control with uncertainty quantification using adaptive conformal prediction. In: ACC (2024)