Theoretical Foundations of Conformal Prediction
39 Pith papers cite this work. Polarity classification is still indexing.
abstract
This book is about conformal prediction and related inferential techniques that build on permutation tests and exchangeability. These techniques are useful in a diverse array of tasks, including hypothesis testing and providing uncertainty quantification guarantees for machine learning systems. Much of the current interest in conformal prediction is due to its ability to integrate into complex machine learning workflows, solving the problem of forming prediction sets without any assumptions on the form of the data generating distribution. Since contemporary machine learning algorithms have generally proven difficult to analyze directly, conformal prediction's main appeal is its ability to provide formal, finite-sample guarantees when paired with such methods. The goal of this book is to teach the reader about the fundamental technical arguments that arise when researching conformal prediction and related questions in distribution-free inference. Many of these proof strategies, especially the more recent ones, are scattered among research papers, making it difficult for researchers to understand where to look, which results are important, and how exactly the proofs work. We hope to bridge this gap by curating what we believe to be some of the most important results in the literature and presenting their proofs in a unified language, with illustrations, and with an eye towards pedagogy.
citing papers explorer
-
Impossibility of Distribution-Free Predictive Inference for Individual Treatment Effects
Distribution-free predictive inference for individual treatment effects is impossible: any valid set must have infinite expected length under standard assumptions with continuous covariates.
-
Testing hypotheses via orthogonalization
A noise-based orthogonalization framework for valid hypothesis testing that extends to post-selection inference under mild assumptions.
-
Full Conformal Prediction under Stochastic Non-Conformity Measure
The relaxation to permutation invariance in distribution is shown to be insufficient for full conformal prediction validity under stochastic non-conformity measures, and Conditional Independence & Permutation Invariance in Distribution is provided as the correct sufficient condition.
-
Conformal Language Modeling via Posterior Sampling
Conformal language modeling samples from posterior approximations conditioned on high-scoring regions to achieve risk control with higher utility than post-hoc filtering in open-ended text generation.
-
Weighted Conformal Clustering
Develops a weighted conformal clustering method that corrects for synthetic labels via conditional distribution shift to achieve finite-sample marginal coverage with explicit bounds for estimated weights.
-
Leave a Window Out: Modifying the Jackknife for Predictive Inference in Time Series
The leave-a-window-out jackknife achieves valid predictive coverage in time series under mild model stability by introducing coefficients that quantify departure from cyclic exchangeability.
-
Online Conformal Prediction: Enforcing monotonicity via Online Optimization
Two novel online conformal prediction algorithms enforce nested prediction sets across coverage levels using online optimization with regret bounds for quantile error control.
-
A Unified Theory of Conditional Coverage in Conformal Prediction with Applications
Non-asymptotic decomposition of conditional miscoverage in conformal prediction into score-estimation error, finite-sample calibration error, and intrinsic conditional-mismatch error, with guidance for model selection and extensions to covariate shift and structured data.
-
When Are Trade-Off Functions Testable from Finite Samples?
Trade-off functions between two distributions are finitely testable if and only if their Neyman-Pearson rejection regions are attainable by a VC-class of sets.
-
Risk-Controlled Post-Processing of Decision Policies
Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.
-
Inference for Clustering: Conformal Sets for Cluster Labels
Split conformal clustering with stochastic labels provides finite-sample marginal coverage guarantees for cluster label confidence sets, controlled by soft-label consistency and replace-one stability of the clustering algorithm.
-
Conformal Risk Control under Non-Monotone Losses: Theory and Finite-Sample Guarantees
Conformal risk control for bounded non-monotone losses over a grid of size m achieves excess risk of order sqrt(log m / n) with n calibration samples, which is minimax optimal.
-
ST-BCP: Tightening Coverage Bound for Backward Conformal Prediction via Non-Conformity Score Transformation
ST-BCP tightens the coverage bound in Backward Conformal Prediction by applying a computable data-dependent transformation to nonconformity scores, reducing the average gap from 4.20% to 1.12% on benchmarks while proving superiority over the identity baseline.
-
PAC-Bayesian Certificates for Quadratic Closed-Loop Control
PAC-Bayesian bounds are derived for quadratic closed-loop control via SLS parameterization, yielding Chernoff certificates for posteriors over responses, a mean-response deployment result, and a data-driven learning algorithm.
-
PRecover 1.0: Process Rate Recovery with Machine Learning
Machine learning models recover most warm-rain and ice microphysical process rates from standard ICON model outputs for accumulation intervals of 10 minutes or less using a two-step classification-regression approach with calibrated uncertainty.
-
Valid Inference with Synthetic Data via Task Exchangeability
Proposes task exchangeability as a condition for valid inference when using synthetic data in scientific research, with methods and extensions demonstrated on surveys and AI evaluations.
-
Distributional Conformal Prediction for Markov Processes
MDCP constructs conformal prediction intervals for Markov processes with non-asymptotic unconditional coverage bounds under beta-mixing and asymptotic conditional validity using kernel estimators and PIT to iid-ify the data.
-
Conditional Predictive Inference for General Structured Data with Group Symmetries
C-SymmPI reformulates conditional coverage as miscoverage error over a user-specified function class to deliver near-conditional guarantees under group symmetries and distributional invariance.
-
Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning
A conformal procedure for CoT replaces majority voting with weighted aggregation and calibrates abstention to guarantee low confident-error rates, achieving 90.1% selective accuracy on GSM8K by abstaining on under 5% of cases.
-
Local Conformal Calibration of Dynamics Uncertainty from Semantic Images
OCULAR applies conformal prediction to semantic perception data for local calibration of dynamics model uncertainty, yielding guaranteed prediction regions without environment-specific calibration data.
-
PASC: Pipeline-Aware Conformal Prediction with Joint Coverage Guarantees for Multi-Stage NLP and LLM Pipelines
PASC converts multi-stage joint coverage into a single scalar conformal problem on the joint max nonconformity score, delivering finite-sample distribution-free guarantees and higher empirical coverage than Bonferroni or independent calibration.
-
Decentralized Conformal Novelty Detection via Quantized Model Exchange
A quantized model exchange framework for decentralized conformal novelty detection preserves conditional exchangeability and delivers finite-sample global FDR control.
-
Conformalized Percentile Interval: Finite Sample Validity and Improved Conditional Performance
A PIT-calibrated percentile interval method delivers finite-sample marginal coverage, asymptotic conditional coverage, and shorter intervals than prior conformal approaches.
-
On a Probability Inequality for Order Statistics with Applications to Bootstrap, Conformal Prediction, and more
An approximate inequality for the probability involving order statistics under near-i.i.d. conditions is established and applied to justify resampling-based statistical procedures.
-
Conformal Inference for Experimental Attrition in Social Science Research
Conformal inference produces robust prediction intervals for treatment effects under experimental attrition, outperforming complete-case, imputation, and weighting approaches in simulations.
-
Adaptive Conformal Prediction for Quantum Machine Learning
The paper proposes AQCP, an algorithm that provides asymptotic average coverage guarantees for quantum conformal prediction under arbitrary hardware noise by repeated recalibration.
-
Feedback-Enhanced Online Multiple Testing with Applications to Conformal Selection
GAIF dynamically adjusts testing thresholds with feedback for finite-sample FDR control in sequential settings and extends to conformal selection via feedback-driven model selection.
-
Estimating the size of a set using cascading exclusion
Develops non-asymptotic estimators for set cardinality via cascading exclusion that bridge classic extremes and apply to convex volumes and unseen species.
-
Reliable Wireless Indoor Localization via Cross-Validated Prediction-Powered Calibration
A cross-validated prediction-powered calibration technique that fine-tunes predictors and estimates synthetic label bias to produce prediction sets with coverage guarantees for wireless indoor localization using scarce calibration data.
-
Tube Loss: A Novel Approach for Prediction Interval Estimation
Tube Loss is a novel loss function enabling simultaneous prediction interval bound estimation with asymptotic coverage guarantees, tunable positioning for skewed distributions, and trade-offs between coverage and width via single optimization.
-
Reliability of Probabilistic Emulation of Physical Systems
CRPS-trained ensembles achieve better uncertainty reliability and speed than latent generative models for probabilistic emulation of 2D physical systems.
-
Conformal Certification of Reasoning Trace Prefixes
CROP applies conformal prediction to certify the longest contiguous prefix of an LLM reasoning trace that is guaranteed to contain no annotated errors under an exchangeability assumption.
-
Conf-Gen: Conformal Uncertainty Quantification for Generative Models
Conf-Gen adapts conformal risk control to generative tasks by relaxing assumptions, unifying prior CP work on LLMs and extending guarantees to image generators, conversational AI, and AI agent correctness.
-
Self-Supervised Conformal Prediction with Equivariant Bootstrapping for Image Uncertainty Quantification
A self-supervised conformal prediction method with equivariant bootstrapping enables uncertainty quantification for ill-posed imaging inverse problems such as weak lensing mass mapping without requiring ground truth calibration data.
-
Inductive Venn-Abers and related regressors
Venn-Abers predictors are extended to unbounded regression via conformal prediction, producing point regressors that modestly improve efficiency over standard methods for large datasets.
-
Conformalized Super Learner
Conformalized super learner builds prediction intervals by weighting conformity scores from base learners via a majority vote, delivering valid coverage for continuous outcomes under exchangeability and heterogeneity.
-
Probably Approximately Correct (PAC) Guarantees for Data-Driven Reachability Analysis: A Theoretical and Empirical Comparison
Formal connections between PAC bounds for three data-driven reachability methods are established, with empirical results showing they are not interchangeable despite similarities.
-
Conformal prediction for uncertainties in the neutron star equation of state
Conformalized quantile regression applied post hoc to neutron star posterior samples yields reliable uncertainty bands validated by empirical coverage studies.
-
Aggregation in conformal e-classification
The paper experimentally studies cross-conformal e-prediction and conceptually simpler modifications for aggregating conformal e-predictors while retaining validity.