pith. machine review for the scientific record.

arXiv:2605.13028 · v1 · submitted 2026-05-13 · 💻 cs.RO · cs.SY · eess.SY


Local Conformal Calibration of Dynamics Uncertainty from Semantic Images

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 19:13 UTC · model grok-4.3

classification 💻 cs.RO · cs.SY · eess.SY
keywords conformal prediction · uncertainty quantification · robot dynamics · semantic images · calibration · safe planning · epistemic uncertainty · aleatoric uncertainty

The pith

OCULAR uses semantic images to calibrate any linear Gaussian dynamics model and guarantee that future states fall inside user-chosen probability regions in unseen environments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OCULAR, which groups calibration data from environments whose semantic images look alike and uses that grouping to adjust uncertainty bounds on a given linear Gaussian dynamics model. The resulting prediction regions are guaranteed to contain the true next state with probability at least 1 minus a user-chosen value, even when the model is imperfect and the test environment differs from the calibration data. A sympathetic reader would care because the guarantees are non-asymptotic and distribution-free, so they apply directly to robot planning without requiring exact environment matches or strong assumptions on the real dynamics. The method also flags which observation-velocity-action triples produce larger uncertainty, supporting safer action selection.

Core claim

OCULAR performs local conformal calibration by matching semantic images to select relevant calibration trajectories, then produces prediction sets around the nominal next-state mean whose size accounts for both process noise and model mismatch; these sets contain the true future state with probability at least 1-epsilon for any user epsilon, regardless of the fidelity of the supplied linear Gaussian model.

What carries the argument

Local conformal calibration that selects calibration samples via semantic-image similarity to adjust next-state prediction intervals.
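In miniature, that mechanism looks like the sketch below: select the calibration samples whose learned representations are most similar to the test input, then take a finite-sample conformal quantile of their nonconformity scores as the interval half-width. The k-nearest-neighbor selection rule and all names here are illustrative assumptions, not the paper's actual procedure (which partitions the input space with a decision tree):

```python
import numpy as np

def local_conformal_interval(cal_features, cal_residuals, test_feature,
                             epsilon=0.1, k=50):
    """Hypothetical sketch of local conformal calibration.

    cal_features : (n, d) array of calibration inputs, standing in for the
        learned semantic-image representations.
    cal_residuals : (n,) nonconformity scores, e.g. distance between the true
        next state and the nominal model's predicted mean.
    Returns a half-width q so that [mean - q, mean + q] covers the true next
    state with probability >= 1 - epsilon, under exchangeability of the
    selected calibration points and the test point.
    """
    # Nearest neighbors in representation space stand in for
    # semantic-image similarity.
    dists = np.linalg.norm(cal_features - test_feature, axis=1)
    local = cal_residuals[np.argsort(dists)[:k]]
    n = len(local)
    # Finite-sample conformal rank: ceil((n+1)(1-epsilon)), clipped to n.
    rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
    return np.sort(local)[rank - 1]
```

The finite-sample rank (rather than the plain empirical quantile) is what makes the guarantee non-asymptotic; with too few local samples the clipped rank returns the maximum score and the interval becomes conservatively wide.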

If this is right

  • The calibrated uncertainty can be used inside a planner to avoid actions whose observation-velocity-action inputs produce large prediction regions.
  • Guarantees remain valid for out-of-distribution test scenes provided the scenes share semantic appearance with the calibration data.
  • The procedure works for dynamics models of arbitrary fidelity, from coarse to highly accurate linear Gaussian approximations.
  • It produces smaller average prediction volumes than methods that must collect data in the exact target environment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Perception similarity can act as a practical proxy for dynamics similarity when collecting calibration data for new settings.
  • The same grouping idea could be tested on real robots to check whether camera-based calibration transfers across physical hardware changes.
  • The distinction between high- and low-uncertainty inputs might be combined with reachability analysis to produce explicit safety certificates.

Load-bearing premise

Data from environments whose semantic images are similar can be used to calibrate the linear Gaussian model so that the resulting bounds remain valid in any new test environment that is also visually similar.

What would settle it

Apply OCULAR to a test environment whose camera images match the calibration set but whose true next-state distribution lies outside the computed prediction regions more often than the target probability.
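That falsification test amounts to a coverage audit: roll out in the visually matched environment and check whether the empirical hit rate of the prediction regions drops below the target. A minimal 1-D sketch, with synthetic numbers rather than the paper's data, and a fixed half-width standing in for the calibrated one:

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_coverage(true_next_states, means, half_widths):
    """Fraction of true next states inside the symmetric prediction
    intervals [mean - q, mean + q] (1-D sketch)."""
    inside = np.abs(true_next_states - means) <= half_widths
    return inside.mean()

# Illustrative check: Gaussian disturbances around the nominal mean.
means = np.zeros(1000)
truths = rng.normal(0.0, 1.0, size=1000)
q = 1.96  # would come from the conformal calibration step
rate = empirical_coverage(truths, means, q)
# A falsifying experiment would show rate < 1 - epsilon despite visual
# similarity; this well-specified Gaussian example sits near 0.95.
```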

Figures

Figures reproduced from arXiv:2605.13028 by Dmitry Berenson, Luís Marques.

Figure 1
Figure 1. Offline component of OCULAR. 1: observations o_t from D_cal^part are projected into a planar footprint o'_t by φ. A CAE is trained to reconstruct o'_t, and the decoder is discarded. 2: All data in D_cal is processed by process() into a learned representation X_i, and nonconformity scores R_i are computed. 3: a Decision Tree is trained on D_cal^part to partition the learned input space X into regions of approximately … view at source ↗
Figure 2
Figure 2. Online component of OCULAR. Given an estimated Gaussian at time t, a desired action a_t, and observation o_t, we create an approximate next-step Gaussian Ñ_{t+1} via the approximate model f̃. The current-time information X_i^raw is processed, and the learned representation X_i passed to the Decision Tree. The resulting leaf node X_k has an associated q̂_k which is multiplied by a fixed constant to get ξ_k. The app… view at source ↗
Figure 3
Figure 3. Rollouts in map U of the planar robot (see App. D.1 for all maps). … [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗
Figure 4
Figure 4. Rollouts in icyMain of the Isaac Sim experiment (see App. D.2 for other …) [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗
Figure 5
Figure 5. Trajectories of OCULAR and baselines in the planar environment, across 30 runs for each map-method combo. The uncalibrated baseline gains too much momentum in the low-friction region, leading to collisions. SplitCP is too conservative, with all its sampled plans being in collision. This results in goal-chasing behavior and an inability to avoid obstacles. LUCCa and OCULAR have comparable performance, slow… view at source ↗
Figure 6
Figure 6. The three tested Isaac Sim environments (based on Rivermark). [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗
Figure 7
Figure 7. Example observation and observed map in the Isaac Sim environment. [PITH_FULL_IMAGE:figures/full_fig_p024_7.png] view at source ↗
Figure 8
Figure 8. Comparison of all methods in the Isaac environment. [PITH_FULL_IMAGE:figures/full_fig_p026_8.png] view at source ↗
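The offline/online split described in Figures 1-2 can be caricatured as follows. The 1-D binning stands in for the paper's decision-tree leaves, and `fit_region_quantiles` / `online_region_width` are hypothetical names, not the paper's API; the point is only the shape of the pipeline: per-region quantiles q̂_k computed offline, scaled to a width ξ_k at query time.

```python
import numpy as np

def fit_region_quantiles(reps, scores, edges, epsilon=0.1):
    """Offline step (sketch): partition the representation space into
    regions (1-D bins standing in for decision-tree leaves) and store a
    finite-sample conformal quantile q_k of the scores per region."""
    q = {}
    for k in range(len(edges) - 1):
        in_k = scores[(reps >= edges[k]) & (reps < edges[k + 1])]
        n = len(in_k)
        if n == 0:
            q[k] = np.inf  # no data: region stays maximally uncertain
            continue
        rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
        q[k] = np.sort(in_k)[rank - 1]
    return q

def online_region_width(rep, edges, q, scale=1.0):
    """Online step (sketch): map the current representation to its region
    and return the scaled half-width xi_k = scale * q_k."""
    k = int(np.clip(np.searchsorted(edges, rep, side="right") - 1,
                    0, len(q) - 1))
    return scale * q[k]
```

Regions whose calibration scores are large yield wide online regions, which is exactly the observation-velocity-action discrimination the review credits to the method.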
read the original abstract

We introduce Observation-aware Conformal Uncertainty Local-Calibration (OCULAR), a conformal prediction-based algorithm that uses perception information to provide uncertainty quantification guarantees for unseen test-time environments. While previous conformal approaches lack the ability to discriminate between state-action space regions leading to higher or lower model mismatch, and require environment-specific data, our method uses data collected from visually similar environments to provably calibrate a given linear Gaussian dynamics model of arbitrary fidelity. The prediction regions generated from OCULAR are guaranteed to contain the future system states with, at least, a user-set likelihood, despite both aleatoric and epistemic uncertainty -- i.e., uncertainty arising from both stochastic disturbances and lack of data. Our guarantees are non-asymptotic and distribution-free, not requiring strong assumptions about the unknown real system dynamics. Our calibration procedure enables distinguishing between observation-velocity-action inputs leading to higher and lower next-state-uncertainty, which is helpful for probabilistically-safe planning. We numerically validate our algorithm on a double-integrator system subject to random perturbations and significant model mismatch, using both a simplified sensor and a more realistic simulated camera. Our approach appropriately quantifies uncertainty both when in-distribution and out-of-distribution, being comparatively volume-efficient to baselines requiring environment-specific data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces OCULAR, a conformal-prediction algorithm that uses semantic images to identify visually similar environments and locally calibrate a linear-Gaussian dynamics model of arbitrary fidelity. It claims non-asymptotic, distribution-free marginal coverage guarantees for future states under both aleatoric and epistemic uncertainty, without requiring environment-specific calibration data, and demonstrates the approach on a double-integrator subject to perturbations and model mismatch using both simplified and camera-based sensors.

Significance. If the coverage transfer via semantic similarity can be rigorously established, the result would enable practical uncertainty quantification for robotics in novel environments where collecting matched calibration data is costly or impossible, while also providing spatially varying uncertainty estimates useful for probabilistically safe planning.

major comments (2)
  1. [§3 (theoretical guarantees)] The central coverage claim rests on exchangeability between the semantically selected calibration set and the test points. The manuscript does not derive that perceptual similarity (via semantic images) implies the required exchangeability or bounds the total-variation distance between the induced residual distributions; without this step the non-asymptotic guarantee does not transfer to unseen environments.
  2. [§5 (numerical experiments)] In the numerical validation on the double-integrator, the reported coverage is shown for both in-distribution and out-of-distribution cases, yet the paper does not quantify how the semantic-similarity threshold affects the empirical coverage gap or the volume of the resulting prediction sets; this leaves open whether the method remains volume-efficient when the visual similarity metric is imperfect.
minor comments (2)
  1. [§2–3] Notation for the conformal score and the local calibration set should be introduced with a single consistent symbol table rather than being redefined inline in multiple sections.
  2. [§5] Figure captions for the camera-based sensor experiments should explicitly state the semantic similarity metric and the number of calibration environments used.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. The feedback highlights opportunities to strengthen the presentation of the theoretical assumptions and to expand the experimental sensitivity analysis. We address each major comment below and indicate the planned revisions.

read point-by-point responses
  1. Referee: [§3 (theoretical guarantees)] The central coverage claim rests on exchangeability between the semantically selected calibration set and the test points. The manuscript does not derive that perceptual similarity (via semantic images) implies the required exchangeability or bounds the total-variation distance between the induced residual distributions; without this step the non-asymptotic guarantee does not transfer to unseen environments.

    Authors: We agree that an explicit statement of the exchangeability assumption is needed for clarity. In the revised manuscript we will insert a dedicated paragraph in §3 that states the coverage guarantee holds conditionally on the semantic similarity selection, under the modeling assumption that points sharing the same semantic class are exchangeable. This is the standard localization assumption in conformal methods and preserves the distribution-free, non-asymptotic character of the result. We deliberately avoid total-variation bounds because they would require parametric assumptions on the residual distributions, contradicting the paper’s goal of distribution-free guarantees. The selection procedure itself is deterministic given the images, so the marginal coverage statement remains valid over the joint distribution of calibration and test points that pass the similarity filter. revision: partial

  2. Referee: [§5 (numerical experiments)] In the numerical validation on the double-integrator, the reported coverage is shown for both in-distribution and out-of-distribution cases, yet the paper does not quantify how the semantic-similarity threshold affects the empirical coverage gap or the volume of the resulting prediction sets; this leaves open whether the method remains volume-efficient when the visual similarity metric is imperfect.

    Authors: We thank the referee for this observation. In the revised §5 we will add two new figures that sweep the similarity threshold over a range of values and plot (i) the empirical coverage gap relative to the nominal level and (ii) the average volume of the prediction sets, separately for the simplified-sensor and camera-based experiments. These plots will include both in-distribution and out-of-distribution test conditions and will demonstrate that coverage remains within a small additive gap of the target while prediction-set volume grows gracefully as the threshold is relaxed, confirming volume efficiency even under imperfect similarity. revision: yes
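The proposed sensitivity study is straightforward to script. A sketch under assumed names and synthetic inputs (none of this is the authors' code): for each similarity threshold, keep only calibration points at least that similar, recompute the conformal quantile, and report the coverage gap and interval width.

```python
import numpy as np

def sweep_similarity_threshold(sims, residuals, test_residuals, thresholds,
                               epsilon=0.1):
    """For each threshold tau, calibrate on points with similarity >= tau
    and report (tau, empirical coverage gap, interval half-width)."""
    out = []
    for tau in thresholds:
        kept = residuals[sims >= tau]
        n = len(kept)
        if n == 0:
            out.append((tau, np.nan, np.inf))  # nothing passes the filter
            continue
        rank = min(int(np.ceil((n + 1) * (1 - epsilon))), n)
        q = np.sort(kept)[rank - 1]
        cov = np.mean(test_residuals <= q)
        out.append((tau, cov - (1 - epsilon), q))
    return out
```

Plotting gap and width against tau would show directly whether coverage degrades or the sets merely inflate as the similarity filter is relaxed, which is the referee's open question.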

Circularity Check

0 steps flagged

OCULAR applies standard conformal prediction to semantically similar environments; guarantees derive from exchangeability without reduction to fitted inputs or self-citations

full rationale

The derivation relies on established conformal prediction results for non-asymptotic distribution-free coverage under exchangeability, applied to calibration data selected by semantic similarity. No equations define the prediction regions in terms of the target test quantities themselves, nor do any load-bearing steps reduce to parameters fitted from the same data or to self-citations whose validity depends on the present work. The linear-Gaussian model calibration and local uncertainty distinction follow directly from the conformal quantile construction on the selected sets, preserving independence from the test distribution. This is self-contained against external benchmarks of conformal prediction theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Relies on standard conformal prediction exchangeability assumptions and the domain assumption that visual similarity in semantic images correlates with dynamics similarity sufficient for calibration.

axioms (2)
  • standard math Conformal prediction provides valid coverage under exchangeability of calibration and test points
    Core property invoked for the non-asymptotic guarantees.
  • domain assumption Visually similar environments provide data that can calibrate the linear Gaussian model for the target environment
    Central premise allowing use of perception data instead of environment-specific collection.
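The first axiom is checkable numerically: with i.i.d. (hence exchangeable) scores, the conformal quantile covers a fresh point at close to the nominal rate. A minimal self-contained check, unrelated to the paper's data:

```python
import numpy as np

rng = np.random.default_rng(2)

# With n exchangeable calibration scores and rank ceil((n+1)(1-eps)),
# a fresh point is covered with probability rank/(n+1) = 0.90 here.
epsilon, n, trials = 0.1, 99, 2000
hits = 0
for _ in range(trials):
    scores = rng.exponential(size=n)
    rank = int(np.ceil((n + 1) * (1 - epsilon)))  # = 90
    q = np.sort(scores)[rank - 1]
    hits += rng.exponential() <= q  # fresh draw from the same distribution
coverage = hits / trials
# coverage concentrates near 0.90 regardless of the score distribution
```

The second axiom is the one the referee targets: no amount of simulation of exchangeable data can establish that visual similarity induces exchangeability between calibration and test environments.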

pith-pipeline@v0.9.0 · 5515 in / 1258 out tokens · 44173 ms · 2026-05-14T19:13:22.012873+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

  1. Agrawal, D.R., Panagou, D.: Safe control synthesis via input constrained control barrier functions. In: CDC (2021)
  2. Alsterda, J.P., Brown, M., Gerdes, J.C.: Contingency model predictive control for automated vehicles. In: ACC, pp. 717–722 (2019). https://doi.org/10.23919/ACC.2019.8815260
  3. Ames, A.D., Coogan, S., Egerstedt, M., Notomista, G., Sreenath, K., Tabuada, P.: Control barrier functions: Theory and applications. In: ECC, pp. 3420–3431. IEEE (2019)
  4. Angelopoulos, A.N., Barber, R.F., Bates, S.: Theoretical foundations of conformal prediction (2025). https://arxiv.org/abs/2411.11824
  5. Bansal, S., Chen, M., Herbert, S., Tomlin, C.J.: Hamilton–Jacobi reachability: A brief overview and recent advances. In: CDC, pp. 2242–2253. IEEE (2017)
  6. Blackmore, L., Ono, M., Williams, B.C.: Chance-constrained optimal path planning with obstacles. IEEE Transactions on Robotics 27(6), 1080–1094 (2011). https://doi.org/10.1109/TRO.2011.2161160
  7. Breiman, L.: Classification and Regression Trees. Routledge (2017)
  8. Cabezas, L.M., Otto, M.P., Izbicki, R., Stern, R.B.: Regression trees for fast and adaptive prediction intervals. Information Sciences 686, 121369 (2025)
  9. Cosner, R.K., Culbertson, P., Taylor, A.J., Ames, A.D.: Robust safety under stochastic uncertainty with discrete-time control barrier functions. In: RSS (2023)
  10. Dixit, A., Lindemann, L., Wei, S.X., Cleaveland, M., Pappas, G.J., Burdick, J.W.: Adaptive conformal prediction for motion planning among dynamic agents. In: Proceedings of The 5th Annual Learning for Dynamics and Control Conference, Proceedings of Machine Learning Research, vol. 211, pp. 300–314. PMLR
  11. Gibbs, I., Candes, E.: Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems 34, 1660–1672 (2021)
  12. Hartley, R.: Multiple View Geometry in Computer Vision, vol. 665. Cambridge University Press (2003)
  13. Khan, M., Ibuki, T., Chatterjee, A.: Safety uncertainty in control barrier functions using Gaussian processes. In: ICRA (2021)
  14. Kuchibhotla, A.K.: Exchangeability, conformal prediction, and rank tests. arXiv preprint arXiv:2005.06095 (2020)
  15. LaValle, S.M.: Planning Algorithms. Cambridge University Press (2006)
  16. Lei, J., Wasserman, L.: Distribution-free prediction bands for non-parametric regression. Journal of the Royal Statistical Society Series B: Statistical Methodology 76(1), 71–96 (2014)
  17. Lindemann, L., Cleaveland, M., Shim, G., Pappas, G.J.: Safe planning in dynamic environments using conformal prediction. IEEE Robotics and Automation Letters 8(8), 5116–5123 (2023)
  18. Luo, R., Zhao, S., Kuck, J., Ivanovic, B., Savarese, S., Schmerling, E., Pavone, M.: Sample-efficient safety assurances using conformal prediction. IJRR (2024)
  19. Marques, L., Berenson, D.: Quantifying aleatoric and epistemic dynamics uncertainty via local conformal calibration. In: Algorithmic Foundations of Robotics XVI, vol. 2, pp. 85–103. Springer Nature Switzerland, Cham (2026)
  20. Marques, L., Ghaffari, M., Berenson, D.: Lies we can trust: Quantifying action uncertainty with inaccurate stochastic dynamics through conformalized non-holonomic Lie groups. IEEE Robotics and Automation Letters, pp. 1–8 (2026). https://doi.org/10.1109/LRA.2026.3656773
  21. Mei, Z., Dixit, A., Booker, M., Zhou, E., Storey-Matsutani, M., Ren, A.Z., Shorinwa, O., Majumdar, A.: Perceive with confidence: Statistical safety assurances for navigation with learning-based perception. The International Journal of Robotics Research, 02783649251378151 (2024)
  22. Oliveira, M., Santos, V., Sappa, A.D.: Multimodal inverse perspective mapping. Information Fusion 24, 108–121 (2015)
  23. Ren, T., Liu, S., Zeng, A., Lin, J., Li, K., Cao, H., Chen, J., Huang, X., Chen, Y., Yan, F., et al.: Grounded SAM: Assembling open-world models for diverse visual tasks. arXiv preprint arXiv:2401.14159 (2024)
  24. Shi, G., Shi, X., O'Connell, M., Yu, R., Azizzadenesheli, K., Anandkumar, A., Yue, Y., Chung, S.J.: Neural Lander: Stable drone landing control using learned dynamics. In: ICRA (2019)
  25. Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., et al.: DINOv3. arXiv preprint arXiv:2508.10104 (2025)
  26. Stutts, A.C., Erricolo, D., Tulabandhula, T., Trivedi, A.R.: Lightweight, uncertainty-aware conformalized visual odometry. In: IROS, pp. 7742–7749 (2023). https://doi.org/10.1109/IROS55552.2023.10341924
  27. Singh, S., Pavone, M., Slotine, J.J.: Tube-based MPC: a contraction theory approach. In: IEEE CDC (2016)
  28. Sun, J., Jiang, Y., Qiu, J., Nobel, P., Kochenderfer, M.J., Schwager, M.: Conformal prediction for uncertainty-aware planning with diffusion dynamics model. NeurIPS 36, 80324–80337 (2023)
  29. Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer (2005)
  30. Wang, H., Borquez, J., Bansal, S.: Providing safety assurances for systems with unknown dynamics. IEEE Control Systems Letters (2024)
  31. Williams, G., Wagener, N., Goldfain, B., Drews, P., Rehg, J.M., Boots, B., Theodorou, E.A.: Information theoretic MPC for model-based reinforcement learning. In: ICRA, pp. 1714–1721. IEEE (2017)
  32. Yang, S., Pappas, G.J., Mangharam, R., Lindemann, L.: Safe perception-based control under stochastic sensor uncertainty using conformal prediction. In: CDC, pp. 6072–6078 (2023). https://doi.org/10.1109/CDC49753.2023.10384075
  33. Zhou, H., Zhang, Y., Luo, W.: Safety-critical control with uncertainty quantification using adaptive conformal prediction. In: ACC (2024)