Adaptive Conformal Prediction for Quantum Machine Learning

Douglas Spencer; Michele Caprio; Samual Nicholls

arXiv: 2511.18225 · v2 · pith:QNDZ7TL6new · submitted 2025-11-23 · 💻 cs.LG · stat.ML · stat.OT

Pith reviewed 2026-05-21 19:23 UTC · model grok-4.3

classification 💻 cs.LG · stat.ML · stat.OT

keywords quantum machine learning · conformal prediction · adaptive conformal inference · hardware noise · uncertainty quantification · prediction sets · quantum computing

The pith

Adaptive recalibration on streaming data restores conformal coverage guarantees under time-varying quantum noise.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a method to keep prediction sets valid even as quantum hardware noise changes over time. Standard conformal prediction assumes exchangeable data, but quantum processors have drifting noise that breaks this. By repeatedly recalibrating on new data, the approach aims to maintain the promised coverage level on average. A sympathetic reader would care because reliable uncertainty estimates are essential for trusting quantum machine learning outputs in practice. The work shows through experiments on real hardware that this adaptive version achieves the target coverage where the non-adaptive version does not.

Core claim

The paper claims that Adaptive Quantum Conformal Prediction (AQCP) provides asymptotic average coverage guarantees under arbitrary hardware noise conditions by drawing on adaptive conformal inference and repeated recalibration.

What carries the argument

Adaptive Quantum Conformal Prediction (AQCP), an algorithm that maintains validity over time via repeated recalibration on streaming data.
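
A minimal sketch of the kind of recalibration loop this describes, assuming the classical adaptive conformal inference update α_{t+1} = α_t + γ(α − err_t) combined with a rolling window of recent calibration scores. Here `alpha` is the target miscoverage rate; the function name `aci_run`, the window length, and the synthetic drifting score stream are illustrative assumptions, not the paper's implementation. Setting `gamma=0` freezes α_t at its initial value, which gives a non-adaptive comparison.

```python
import numpy as np

def aci_run(scores, alpha=0.1, gamma=0.03, window=200):
    """Run an ACI-style update over a stream of nonconformity scores.

    scores[t] is the score of the true outcome at step t (hypothetical input).
    Returns a 0/1 array: 1 where the true outcome fell outside the prediction set.
    """
    scores = np.asarray(scores, dtype=float)
    alpha_t = alpha
    errs = np.zeros(len(scores) - window, dtype=int)
    for i, t in enumerate(range(window, len(scores))):
        calib = scores[t - window:t]           # rolling calibration window
        if alpha_t <= 0.0:
            threshold = np.inf                 # set covers everything
        elif alpha_t >= 1.0:
            threshold = -np.inf                # empty set
        else:
            threshold = np.quantile(calib, 1.0 - alpha_t)
        errs[i] = int(scores[t] > threshold)   # miss if the true score exceeds the threshold
        alpha_t += gamma * (alpha - errs[i])   # widen sets after a miss, tighten after a hit
    return errs

# Synthetic stream whose scores drift upward over time (assumed, for illustration only).
rng = np.random.default_rng(0)
stream = rng.normal(size=5000) + np.linspace(0.0, 3.0, 5000)
adaptive = aci_run(stream, gamma=0.03)
static = aci_run(stream, gamma=0.0)
print("adaptive miss rate:", adaptive.mean(), "static miss rate:", static.mean())
```

The only state carried between steps is the scalar `alpha_t`, so the extra cost over non-adaptive calibration is one comparison and one addition per test point.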

If this is right

AQCP achieves the target coverage level on an IBM quantum processor.
AQCP exhibits greater stability than standard quantum conformal prediction.
Conformal guarantees hold on average despite non-stationary noise.
Repeated recalibration restores validity when exchangeability is approximate.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar recalibration strategies might benefit conformal prediction in other noisy or non-stationary settings beyond quantum hardware.
The observed stability gains suggest the method could support more reliable real-world deployment of quantum ML models.
Varying the recalibration window size in future tests could reveal optimal frequencies for different noise profiles.

Load-bearing premise

Repeated recalibration on streaming data can restore coverage validity even when the underlying quantum noise process is non-stationary and the calibration/test exchangeability is only approximate.

What would settle it

A long-term run of AQCP on a quantum processor with tracked time-varying noise, checking whether the empirical average coverage over many steps converges to the target probability.
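
Such a check could be sketched as follows, assuming a 0/1 record of whether each prediction set contained the true outcome. The helper name `coverage_report` and the toy record are hypothetical; the running average is the quantity an asymptotic average-coverage claim speaks to, while the rolling average shows local deviations.

```python
import numpy as np

def coverage_report(covered, window=500):
    """Return the running (cumulative) and rolling-window coverage of a 0/1 sequence."""
    covered = np.asarray(covered, dtype=float)
    running = np.cumsum(covered) / np.arange(1, len(covered) + 1)
    kernel = np.ones(window) / window
    rolling = np.convolve(covered, kernel, mode="valid")  # one value per full window
    return running, rolling

# Toy record: 1 means the prediction set contained the true outcome.
covered = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
running, rolling = coverage_report(covered, window=5)
print("long-run coverage:", running[-1])
print("rolling coverage:", rolling)
```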

Figures

Figures reproduced from arXiv: 2511.18225 by Douglas Spencer, Michele Caprio, Samual Nicholls.

**Figure 2.** Regression Model Shots (Simulated vs. ibm_sherbrooke). Comparison of 100,000 shots sampled from each backend (Qiskit Aer simulator and ibm_sherbrooke). The marker size is scaled proportionally to the count of overlapping shots at each location. The red lines represent the component mean functions µ(x) and −µ(x). We trained the angle encoder parameters using the TorchQuantum framework (Wang et al., 2022), …

**Figure 3.** Results obtained using the k-NN score function. The baseline of γ = 0 exhibits substantial deviations from the target coverage of α. For example, it over-covers between test points 2,000 and 3,000, and later under-covers around test point 8,500. In contrast, AQCP (γ = 0.03) shows greater stability. Once the initial rolling window is fully populated, AQCP consistently maintains the average cove…

**Figure 5.** Panel (a) presents the efficiency results for our multimodal regression task. All score functions perform similarly for small shot numbers M ≤ 10, after which their behaviours diverge. sKDE and sHDR produce comparable average set sizes across all values of M, both showing a steady decline of average set size as M increases logarithmically. sHDR achieves the smallest average set size at M = 1,000. sKDE demonstrat…
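
The captions name a k-NN score function computed from measurement shots but do not define it. One common form, sketched here as an assumption rather than the paper's definition, scores a candidate output by its distance to the k-th nearest of the M shots, so candidates far from every cluster of shots receive large scores. The name `knn_score`, the choice k = 3, and the toy bimodal shots are illustrative.

```python
import numpy as np

def knn_score(y, shots, k=5):
    """Distance from candidate y to its k-th nearest shot (1-D outputs assumed)."""
    dists = np.sort(np.abs(np.asarray(shots, dtype=float) - y))
    return dists[k - 1]

# Toy bimodal shots clustered around +1 and -1.
shots = np.array([0.9, 1.0, 1.1, -1.0, -0.9, -1.1])
print(knn_score(1.0, shots, k=3))  # candidate inside a cluster: small score
print(knn_score(0.0, shots, k=3))  # candidate between the clusters: larger score
```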

Original abstract

Quantum machine learning seeks to leverage quantum computers to improve upon classical machine learning algorithms. Currently, robust uncertainty quantification methods remain underdeveloped in the quantum domain, despite the critical need for reliable and trustworthy predictions. Recent work has introduced quantum conformal prediction, a framework that produces prediction sets that are guaranteed to contain the true outcome with a user-specified probability. In this work, we formalise how the time-varying noise inherent in quantum processors can undermine conformal guarantees, even when calibration and test data are exchangeable. To address this challenge, we draw on Adaptive Conformal Inference, a method which maintains validity over time via repeated recalibration. We introduce Adaptive Quantum Conformal Prediction (AQCP), an algorithm which provides asymptotic average coverage guarantees under arbitrary hardware noise conditions. Empirical studies on an IBM quantum processor demonstrate that AQCP achieves the target coverage level and exhibits greater stability than quantum conformal prediction.
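
For readers new to the framework, the non-adaptive calibration step that conformal prediction builds on can be sketched as below, assuming split conformal prediction with a generic nonconformity score. The function names and the toy scores are hypothetical; `alpha` is the allowed miscoverage rate, and the coverage guarantee of this step relies on exchangeability of calibration and test scores.

```python
import numpy as np

def conformal_threshold(calib_scores, alpha=0.1):
    """Finite-sample conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    scores = np.sort(np.asarray(calib_scores, dtype=float))
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the only valid threshold is infinite.
    return np.inf if rank > n else scores[rank - 1]

def prediction_set(candidates, score_fn, threshold):
    """Keep every candidate whose nonconformity score is within the threshold."""
    return [y for y in candidates if score_fn(y) <= threshold]

calib = np.arange(1, 11) / 10              # toy calibration scores 0.1, ..., 1.0
q = conformal_threshold(calib, alpha=0.2)
print(q, prediction_set([-2, -1, 0, 1, 2], abs, 1.0))
```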

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adapts classical adaptive conformal prediction to quantum ML to handle drifting hardware noise and shows better empirical stability on IBM devices, but the asymptotic coverage claim under arbitrary noise lacks visible justification for how fast or abruptly the noise can change.

The main takeaway is that this work takes the adaptive conformal inference approach and applies it to quantum conformal prediction so that prediction sets stay valid as noise drifts on real hardware. They formalize how time-varying noise can break standard guarantees even with roughly exchangeable batches, then introduce AQCP with periodic recalibration and test it on an IBM processor where it holds coverage better than the static version. That empirical demonstration is the clearest contribution and addresses a practical barrier people actually run into when running quantum models over time.

The observation that exchangeability alone is not enough under non-stationary hardware noise is fair and worth stating. The experiments give a concrete sense that recalibration helps in practice.

The soft spot is the central guarantee. The claim of asymptotic average coverage under arbitrary noise is stated directly, but the abstract and available details do not spell out the conditions on how quickly or adversarially the noise process can vary. Standard adaptive conformal results usually require the distribution to change slowly enough for the recalibration window to track it or to have a well-behaved long-run average. Quantum noise like T1/T2 drift or crosstalk can jump between calibrations, and it is not obvious the average still converges. Without the derivation or a clear statement of the regularity conditions, the theoretical part feels thin. The empirical results would also be more convincing with error bars or repeated runs across different drift patterns.

This paper is mainly for people working on uncertainty quantification for quantum machine learning on current noisy devices. A reader already familiar with conformal prediction who wants to see it applied to real hardware would get the most out of the empirical comparison. It is not a foundational theoretical advance, but it tackles a real deployment issue with a straightforward adaptation.

I would send it for peer review rather than desk reject, with the expectation that the authors strengthen the conditions for the coverage guarantee and add more statistical detail to the experiments.

Referee Report

2 major / 3 minor

Summary. The paper introduces Adaptive Quantum Conformal Prediction (AQCP), an adaptation of adaptive conformal inference tailored to quantum machine learning. It formalizes how time-varying hardware noise can invalidate standard conformal guarantees even under approximate exchangeability, then proposes repeated recalibration to achieve asymptotic average coverage under arbitrary noise. Empirical results on IBM quantum hardware show that AQCP attains the target coverage level with greater stability than non-adaptive quantum conformal prediction.

Significance. If the asymptotic guarantee can be made rigorous under realistic quantum noise models, the work would meaningfully extend conformal prediction to the quantum setting, where non-stationary noise is a central practical obstacle. The explicit treatment of hardware noise and the real-device experiments are strengths; the absence of parameter-free derivations or machine-checked proofs is noted but does not diminish the empirical contribution.

major comments (2)

[§3] §3 (theoretical development): the statement that AQCP 'provides asymptotic average coverage guarantees under arbitrary hardware noise conditions' is not accompanied by the precise regularity conditions on the noise process (e.g., existence of a limiting average distribution or bounded variation rate) that are required for the standard adaptive conformal inference result to apply. Abrupt, instance-specific jumps in T1/T2 or crosstalk—common on IBM devices—can violate these conditions, so the long-run coverage claim does not automatically follow from the classical result. A concrete counter-example or additional assumption would be needed to support the central claim.
[§4] §4 (empirical evaluation): the reported coverage is described as 'achieving the target level' without error bars, number of independent runs, or explicit quantification of how the recalibration window interacts with finite-shot quantum measurement noise. This weakens the evidence that the method restores validity when exchangeability is only approximate inside each batch.
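
For orientation, the classical adaptive conformal inference update and its long-run bound (Gibbs and Candès, 2021) are usually written as below. This is the external result, recalled under the assumption that AQCP's guarantee takes a similar form; this page does not reproduce the paper's theorem, so the symbols follow the classical notation, in which $\alpha$ is the target miscoverage rate, $\gamma > 0$ the step size, and $\hat{C}_t(\alpha_t)$ the prediction set issued at step $t$.

```latex
\[
\alpha_{t+1} = \alpha_t + \gamma\,(\alpha - \mathrm{err}_t), \qquad
\mathrm{err}_t = \mathbf{1}\{\, Y_t \notin \hat{C}_t(\alpha_t) \,\},
\]
\[
\left| \frac{1}{T}\sum_{t=1}^{T} \mathrm{err}_t - \alpha \right|
\;\le\; \frac{\max\{\alpha_1,\; 1-\alpha_1\} + \gamma}{\gamma\, T}.
\]
```

The bound concerns the long-run average only; it says nothing about coverage at a single time step or over a short window. How its conditions map onto the hardware-noise setting is the question the first major comment raises.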

minor comments (3)

[§2] Notation for the recalibration window size and the quantum noise model should be introduced earlier and used consistently; currently the transition from classical adaptive conformal inference to the quantum case is abrupt.
[Figures 2-4] Figure captions should state the number of shots per circuit and the exact IBM backend used; this information is only in the text and is easy to miss.
[§4] A short discussion of computational overhead (extra circuit executions for recalibration) would help readers assess practicality on near-term hardware.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and outline the revisions we will incorporate to strengthen the manuscript.

Referee: [§3] §3 (theoretical development): the statement that AQCP 'provides asymptotic average coverage guarantees under arbitrary hardware noise conditions' is not accompanied by the precise regularity conditions on the noise process (e.g., existence of a limiting average distribution or bounded variation rate) that are required for the standard adaptive conformal inference result to apply. Abrupt, instance-specific jumps in T1/T2 or crosstalk—common on IBM devices—can violate these conditions, so the long-run coverage claim does not automatically follow from the classical result. A concrete counter-example or additional assumption would be needed to support the central claim.

Authors: We agree that the central claim would be more precise with an explicit statement of the regularity conditions inherited from the adaptive conformal inference literature. In the revised manuscript we will add a dedicated paragraph in §3 that recalls the standard conditions (existence of a limiting average distribution together with a bounded-variation or ergodicity-type requirement on the noise process) and states that AQCP inherits the asymptotic average-coverage guarantee whenever these conditions hold for the underlying hardware-noise sequence. We will further note that typical IBM-device noise (slowly drifting T1/T2 and crosstalk) satisfies the conditions in practice, while acknowledging that sufficiently abrupt, non-stationary jumps could violate them; this limitation will be discussed explicitly rather than claiming the result for completely arbitrary noise.
revision: yes
Referee: [§4] §4 (empirical evaluation): the reported coverage is described as 'achieving the target level' without error bars, number of independent runs, or explicit quantification of how the recalibration window interacts with finite-shot quantum measurement noise. This weakens the evidence that the method restores validity when exchangeability is only approximate inside each batch.

Authors: We concur that the empirical section would be strengthened by more rigorous statistical reporting. In the revision we will (i) report results from multiple independent experimental runs (at least five full repetitions on the same IBM backend), (ii) add error bars or shaded regions to all coverage plots, and (iii) include a short analysis (or supplementary figure) that varies the recalibration-window length and quantifies its interaction with finite-shot measurement noise. These additions will provide clearer evidence that AQCP restores validity under the approximate exchangeability present in each batch.
revision: yes
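
The error bars promised in (ii) could be computed along these lines, assuming each independent run yields a coverage curve of equal length. The function name `coverage_with_error` and the three toy runs are hypothetical; the standard error of the mean is one reasonable choice for a shaded region, not necessarily the one the authors will use.

```python
import numpy as np

def coverage_with_error(runs):
    """Mean coverage per time step and its standard error across independent runs.

    runs: array of shape (n_runs, n_steps) holding coverage values.
    """
    runs = np.asarray(runs, dtype=float)
    mean = runs.mean(axis=0)
    sem = runs.std(axis=0, ddof=1) / np.sqrt(runs.shape[0])  # sample std / sqrt(n_runs)
    return mean, sem

# Three toy runs, two time steps each.
runs = np.array([[0.88, 0.90], [0.92, 0.90], [0.90, 0.90]])
mean, sem = coverage_with_error(runs)
print(mean, sem)
```

With only a handful of runs the standard error is itself noisy, so reporting the number of runs next to each plot keeps the bars interpretable.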

Circularity Check

0 steps flagged

No significant circularity; derivation adapts external classical method without reduction to self-inputs.

The paper's central claim of asymptotic average coverage for AQCP under arbitrary hardware noise is presented as an adaptation of existing Adaptive Conformal Inference results to the quantum setting with time-varying noise. No equations, self-citations, or steps in the abstract or described derivation reduce the coverage guarantee to a fitted parameter, self-definition, or prior author work that itself assumes the target result. The method relies on repeated recalibration as an external technique whose validity properties are imported rather than re-derived within the paper. This is a standard, non-circular extension against external benchmarks, consistent with the reader's assessment of no obvious circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the classical adaptive conformal inference framework plus an unstated model of how quantum noise evolves; no new free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Exchangeability of calibration and test data holds approximately despite time-varying quantum noise.
Invoked when stating that standard conformal guarantees can be undermined by hardware noise.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We introduce Adaptive Quantum Conformal Prediction (AQCP), an algorithm which provides asymptotic average coverage guarantees under arbitrary hardware noise conditions."

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "the non-stationary nature of hardware noise in current-generation quantum processors"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 3 internal anchors

[1] Theoretical Foundations of Conformal Prediction
URL https://arxiv.org/abs/2411.11824. Rina Foygel Barber, Emmanuel J Candes, Aaditya Ramdas, and Ryan J Tibshirani. Conformal prediction beyond exchangeability. The Annals of Statistics, 51(2):816–845,

[2] [Online]
doi: 10.1038/nature23474. Sergey Bravyi, Sarah Sheldon, Abhinav Kandala, David C Mckay, and Jay M Gambetta. Mitigating measurement errors in multiqubit experiments. Physical Review A, 103(4):042605,

[3] Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter
URL https://arxiv.org/abs/2511.15969. Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter. Fast and accurate deep network learning by exponential linear units (elus). arXiv preprint arXiv:1511.07289, 4(5):11,

[4] Stability of noisy quantum computing devices
Samudra Dasgupta and Travis S Humble. Stability of noisy quantum computing devices. arXiv preprint arXiv:2105.09472,

[5] Arun Kumar Kuchibhotla. Exchangeability, conformal prediction, and rank tests. arXiv preprint arXiv:2005.06095,

[6] Distribution-Free Predictive Inference For Regression
URL https://arxiv.org/abs/1604.04173. Lorenzo Leone, Salvatore FE Oliviero, Lukasz Cincio, and Marco Cerezo. On the practical usefulness of the hardware efficient ansatz. Quantum, 8:1395,

[7] Unsupervised Machine Learning on a Hybrid Quantum Computer
Johannes S Otterbach, Riccardo Manenti, Nasser Alidoust, A Bestwick, M Block, B Bloom, S Caldwell, N Didier, E Schuyler Fried, S Hong, et al. Unsupervised machine learning on a hybrid quantum computer. arXiv preprint arXiv:1712.05771,

[8] doi: 10.1088/1742-6596/2634/1/012043. URL https://doi.org/10.1088/1742-6596/2634/1/012043. Minati Rath and Hema Date. Quantum data encoding: A comparative analysis of classical-to-quantum mapping techniques and their impact on machine learning accuracy. EPJ Quantum Technology, 11(1):72,

[9] Osvaldo Simeone et al
doi: 10.1109/ICECA.2018.8474918. Osvaldo Simeone et al. An introduction to quantum machine learning for engineers. Foundations and Trends in Signal Processing, 16(1-2):1–223,

[10] Quantum-enhanced conformal methods for multi-output uncertainty: A holistic exploration and experimental analysis
Emre Tasar. Quantum-enhanced conformal methods for multi-output uncertainty: A holistic exploration and experimental analysis. arXiv preprint arXiv:2501.10414,

[11] ISSN 1863-9038. doi: 10.1007/s12027-020-00602-0. URL https://doi.org/10.1007/s12027-020-00602-0. Matteo Zecchin, Sangwoo Park, and Osvaldo Simeone. Forking uncertainties: Reliable prediction and model predictive control with sequence models via conformal risk control,

[12] Puning Zhao and Lifeng Lai
URL https://arxiv.org/abs/2310.10299. Puning Zhao and Lifeng Lai. Analysis of knn density estimation. IEEE Transactions on Information Theory, 68(12):7971–7995,

[13] A Appendix: Conformal Prediction Beyond Exchangeability. We follow the procedure in Barber et al. (2023), making the necessary adaptations to our setting. Denote Z_i = (X_i, Y_i; A_{X_i}, T_i). A weight w_i ∈ [0,1] is assigned to each data point to quantify its similarity to a given test point. These weights can be derived from various metrics, such as the temporal gap b...

[14] th observations swapped, S(z) ∈ R^{n+1} is the residual vector with entries (S(z))_i = Ŝ(x_i, y_i; A_{x_i}, T_i), and d_TV denotes the total variation distance. Their method protects against shifts in the distributions of the Z_i's and, as a consequence, shifts in the scores. The method provides a coverage guarantee of at least 1 − α minus a specific correction term. This correc...

[15] This motivates the definition of score function classes S_i such that, for some λ ∈ R, the set C_λ solves the corresponding optimisation problem i. In the following section, we characterise the optimal score function class for optimisation problems 1 and 2 and develop practical estimators for our quantum learning setting. B.2 Optimal Scores for Marginal Coverag...

[16] produces the negative log density score. In many machine learning settings, any of these forms can be implemented directly. For example, they are natural for classification tasks with softmax probabilities over output classes (Angelopoulos et al., 2023), and they are also appropriate for regression when the model provides an explicit conditional density f...

[17] This theory assumes access to samples taken from the true conditional distributions; however, it provides motivation for the general case. B.3 Optimal Scores for Conditional Coverage (S2) For the conditional guarantee optimisation problem, a similar approach can be taken, but more machinery is required to characterise the full family of optimal score function...
