hub Canonical reference

A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

Anastasios N. Angelopoulos, Stephen Bates · 2021 · cs.LG · arXiv 2107.07511

Canonical reference. 54 Pith papers cite this work, and 83% of classified citations use it as background.

abstract

Black-box machine learning models are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they possess explicit, non-asymptotic guarantees even without distributional assumptions or model assumptions. One can use conformal prediction with any pre-trained model, such as a neural network, to produce sets that are guaranteed to contain the ground truth with a user-specified probability, such as 90%. It is easy-to-understand, easy-to-use, and general, applying naturally to problems arising in the fields of computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction is aimed to provide the reader a working understanding of conformal prediction and related distribution-free uncertainty quantification techniques with one self-contained document. We lead the reader through practical theory for and examples of conformal prediction and describe its extensions to complex machine learning tasks involving structured outputs, distribution shift, time-series, outliers, models that abstain, and more. Throughout, there are many explanatory illustrations, examples, and code samples in Python. With each code sample comes a Jupyter notebook implementing the method on a real-data example; the notebooks can be accessed and easily run using our codebase.
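The coverage guarantee described in the abstract can be illustrated with a minimal sketch of split conformal prediction for classification. This is not the authors' notebook code: the synthetic `fake_model` stands in for any pre-trained classifier's softmax outputs, and the helper names `conformal_quantile` and `prediction_sets` are hypothetical, chosen only for this illustration. The score used here is one minus the softmax probability of the true class, with the usual finite-sample correction of taking the ceil((n+1)(1-alpha))-th smallest calibration score.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: fall back to the trivial set.
        return float("inf")
    return float(np.sort(scores)[k - 1])

def prediction_sets(probs, qhat):
    """Boolean mask: class y is included when its score 1 - probs[y] <= qhat."""
    return (1.0 - probs) <= qhat

# Synthetic stand-in for a pre-trained model (an assumption for this sketch).
rng = np.random.default_rng(0)
n_cal, n_test, n_classes, alpha = 1000, 2000, 5, 0.1

def fake_model(n):
    labels = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), labels] += 2.0  # make the true class more likely
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return probs, labels

cal_probs, cal_labels = fake_model(n_cal)
test_probs, test_labels = fake_model(n_test)

# Calibrate: score each held-out example by 1 - probability of its true class.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]
qhat = conformal_quantile(cal_scores, alpha)

# Predict: build one set per test point and measure how often it holds the truth.
sets = prediction_sets(test_probs, qhat)
coverage = sets[np.arange(n_test), test_labels].mean()
mean_size = sets.sum(axis=1).mean()
print(f"qhat={qhat:.3f} coverage={coverage:.3f} mean set size={mean_size:.2f}")
```

With exchangeable calibration and test data, the empirical coverage should land near the target 1 - alpha = 90%, fluctuating slightly with the particular calibration draw.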

citation-role summary

background: 5, method: 1

citation-polarity summary

background: 5, use method: 1

representative citing papers

An Optimal Sauer Lemma Over $k$-ary Alphabets

cs.LG · 2026-04-14 · unverdicted · novelty 8.0

A sharp Sauer inequality for multiclass and list prediction is established in terms of the DS dimension, tight for every alphabet size k, list size ℓ, and dimension value.

Adaptive Stopping for Multi-Turn LLM Reasoning

cs.CL · 2026-04-01 · unverdicted · novelty 8.0

MiCP is the first conformal prediction method for multi-turn LLM pipelines that allocates per-turn error budgets to enable adaptive stopping with an overall coverage guarantee, shown to reduce turns and cost on RAG and ReAct benchmarks.

Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis

stat.ME · 2026-05-20 · unverdicted · novelty 7.0

Proposes a scale-calibrated median-of-means estimator for robust aggregation of distributed PCA estimates on the product of Euclidean space and Grassmann manifold.

Conformal Prediction via Transported Beta Laws

stat.ML · 2026-05-18 · unverdicted · novelty 7.0

The paper derives that calibration-conditional coverage follows a Beta(k, n+1-k) law under continuous i.i.d. exchangeability and quantifies non-i.i.d. departures via Wasserstein distances on transported beta laws, yielding explicit bounds in scale-shift, clustered, and mixing regimes.

GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

GRAPHLCP improves localized conformal prediction on graphs by using feature-aware densification and Personalized PageRank kernels to incorporate topology for better coverage and efficiency.

TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models

stat.ML · 2026-05-08 · unverdicted · novelty 7.0

TRACE creates valid conformal prediction sets for complex generative models by scoring outputs via averaged denoising or velocity errors along stochastic transport paths instead of likelihoods.

When Does Trimming Help Conformal Prediction? A Retained-Law Diagnostic under Calibration Contamination

stat.ML · 2026-05-07 · unverdicted · novelty 7.0

Trimming helps conformal prediction under contamination precisely when the anomaly score separates retention probabilities without biasing clean scores, otherwise the retained mixture coefficient prevents substantial decontamination.

In-Context Positive-Unlabeled Learning

stat.ML · 2026-05-07 · unverdicted · novelty 7.0

PUICL is a transformer pretrained on synthetic PU data from structural causal models that solves positive-unlabeled classification via in-context learning without gradient updates or fitting.

Delving into Non-Exchangeability for Conformal Prediction in Graph-Structured Multivariate Time Series

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

SCALE uses Spectral Graph Conditional Exchangeability (SGCE) and graph wavelets to achieve valid coverage and improved efficiency in conformal prediction for non-exchangeable graph time series by conformalizing high-frequency residuals conditioned on low-frequency embeddings.

SURE-RAG: Sufficiency and Uncertainty-Aware Evidence Verification for Selective Retrieval-Augmented Generation

cs.CL · 2026-05-05 · unverdicted · novelty 7.0

SURE-RAG aggregates pair-level claim-evidence relations into interpretable signals for selective RAG answering, reaching 0.9075 Macro-F1 on HotpotQA-RAG v3 while providing auditability and reducing unsafe answers by 37% at 30% coverage.

Intrinsic effective sample size for manifold-valued Markov chain Monte Carlo via kernel discrepancy

stat.ML · 2026-05-05 · unverdicted · novelty 7.0

An intrinsic effective sample size for manifold MCMC is defined via kernel discrepancy as the number of independent draws yielding equivalent expected squared discrepancy to the target.

Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space

math.ST · 2026-05-01 · unverdicted · novelty 7.0

The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.

Query-Efficient Quantum Approximate Optimization via Graph-Conditioned Trust Regions

cs.LG · 2026-04-27 · unverdicted · novelty 7.0

A GNN predicts Gaussians over QAOA parameters to create graph-conditioned trust regions that reduce circuit evaluations for MaxCut from 85-343 down to 45 while keeping approximation ratios within 3 points of heuristics.

Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring

cs.LG · 2026-04-22 · unverdicted · novelty 7.0

A model-agnostic adaptive conformal anomaly detection approach uses weighted quantile bounds learned from past foundation model predictions to deliver interpretable p-value scores with stable calibration under shifts for time series monitoring.

Causal inference for social network formation

econ.EM · 2026-04-20 · conditional · novelty 7.0

Random team assignments in a professional firm reveal that indirect ties strongly increase new direct tie formation, while effects of degree and local density are smaller and less robust.

Answer Only as Precisely as Justified: Calibrated Claim-Level Specificity Control for Agentic Systems

cs.CL · 2026-04-19 · unverdicted · novelty 7.0 · 2 refs

Compositional selective specificity (CSS) decomposes generated answers into claims and emits each at the most specific level supported by evidence, raising overcommitment-aware utility from 0.846 to 0.913 on LongFact while retaining 0.938 specificity.

Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations

cs.AI · 2026-04-16 · unverdicted · novelty 7.0

LLM judges display per-document transitivity violations in 33-67% of cases despite low aggregate rates, while conformal prediction set widths serve as reliable indicators of document-level difficulty with cross-judge agreement.

Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise

cs.LG · 2026-04-07 · unverdicted · novelty 7.0

CMRM adds a conformal quantile regularization on prediction margins to any loss, improving noisy-label classification accuracy up to 3.39% across methods and benchmarks while preserving performance at zero noise.

Conformal Risk Control under Non-Monotone Losses: Theory and Finite-Sample Guarantees

stat.ML · 2026-04-02 · unverdicted · novelty 7.0

Conformal risk control for bounded non-monotone losses over a grid of size m achieves excess risk of order sqrt(log m / n) with n calibration samples, which is minimax optimal.

Post-Selection Distributional Model Evaluation

stat.ML · 2026-03-24 · unverdicted · novelty 7.0

PS-DME is a new framework that controls post-selection false coverage rate for distributional KPI estimates via e-values and is provably more sample-efficient than data splitting under explicit conditions.

From Plausibility to Verifiability: Risk-Controlled Generative OCR with Vision-Language Models

cs.CV · 2026-03-20 · unverdicted · novelty 7.0

A model-agnostic Geometric Risk Controller reduces extreme errors in VLM-based OCR by requiring cross-view consensus before accepting outputs.

Safe Planning in Interactive Environments via Iterative Policy Updates and Adversarially Robust Conformal Prediction

eess.SY · 2025-11-13 · conditional · novelty 7.0

The work develops an iterative safe planner that adjusts conformal prediction bounds across policy updates via sensitivity analysis to maintain distribution-free safety guarantees despite interaction-induced distribution shifts.

MARGIN: Runtime Confidence Calibration for Multi-Agent Foundation Model Coordination

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

MARGIN is an online calibration technique using symmetric EWMA and Bayesian shrinkage that learns per-agent per-band factors from the task stream, cutting calibration error 3-6x versus design-time baselines and lifting multi-agent resolution from 45-56% to 70-89%.

BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

BalanceRAG uses sequential graphical testing on a 2D lattice of threshold pairs to certify safe operating points that meet target risk levels in cascaded RAG while increasing coverage.

citing papers explorer

Showing 50 of 54 citing papers.

An Optimal Sauer Lemma Over $k$-ary Alphabets cs.LG · 2026-04-14 · unverdicted · none · ref 1 · internal anchor
A sharp Sauer inequality for multiclass and list prediction is established in terms of the DS dimension, tight for every alphabet size k, list size ℓ, and dimension value.
Adaptive Stopping for Multi-Turn LLM Reasoning cs.CL · 2026-04-01 · unverdicted · none · ref 2 · internal anchor
MiCP is the first conformal prediction method for multi-turn LLM pipelines that allocates per-turn error budgets to enable adaptive stopping with an overall coverage guarantee, shown to reduce turns and cost on RAG and ReAct benchmarks.
Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis stat.ME · 2026-05-20 · unverdicted · none · ref 252 · internal anchor
Proposes a scale-calibrated median-of-means estimator for robust aggregation of distributed PCA estimates on the product of Euclidean space and Grassmann manifold.
Conformal Prediction via Transported Beta Laws stat.ML · 2026-05-18 · unverdicted · none · ref 20 · internal anchor
The paper derives that calibration-conditional coverage follows a Beta(k, n+1-k) law under continuous i.i.d. exchangeability and quantifies non-i.i.d. departures via Wasserstein distances on transported beta laws, yielding explicit bounds in scale-shift, clustered, and mixing regimes.
GRAPHLCP: Structure-Aware Localized Conformal Prediction on Graphs cs.LG · 2026-05-08 · unverdicted · none · ref 1 · internal anchor
GRAPHLCP improves localized conformal prediction on graphs by using feature-aware densification and Personalized PageRank kernels to incorporate topology for better coverage and efficiency.
TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models stat.ML · 2026-05-08 · unverdicted · none · ref 43 · internal anchor
TRACE creates valid conformal prediction sets for complex generative models by scoring outputs via averaged denoising or velocity errors along stochastic transport paths instead of likelihoods.
When Does Trimming Help Conformal Prediction? A Retained-Law Diagnostic under Calibration Contamination stat.ML · 2026-05-07 · unverdicted · none · ref 5 · internal anchor
Trimming helps conformal prediction under contamination precisely when the anomaly score separates retention probabilities without biasing clean scores, otherwise the retained mixture coefficient prevents substantial decontamination.
In-Context Positive-Unlabeled Learning stat.ML · 2026-05-07 · unverdicted · none · ref 20 · internal anchor
PUICL is a transformer pretrained on synthetic PU data from structural causal models that solves positive-unlabeled classification via in-context learning without gradient updates or fitting.
Delving into Non-Exchangeability for Conformal Prediction in Graph-Structured Multivariate Time Series cs.LG · 2026-05-06 · unverdicted · none · ref 32 · internal anchor
SCALE uses Spectral Graph Conditional Exchangeability (SGCE) and graph wavelets to achieve valid coverage and improved efficiency in conformal prediction for non-exchangeable graph time series by conformalizing high-frequency residuals conditioned on low-frequency embeddings.
SURE-RAG: Sufficiency and Uncertainty-Aware Evidence Verification for Selective Retrieval-Augmented Generation cs.CL · 2026-05-05 · unverdicted · none · ref 18 · internal anchor
SURE-RAG aggregates pair-level claim-evidence relations into interpretable signals for selective RAG answering, reaching 0.9075 Macro-F1 on HotpotQA-RAG v3 while providing auditability and reducing unsafe answers by 37% at 30% coverage.
Intrinsic effective sample size for manifold-valued Markov chain Monte Carlo via kernel discrepancy stat.ML · 2026-05-05 · unverdicted · none · ref 194 · internal anchor
An intrinsic effective sample size for manifold MCMC is defined via kernel discrepancy as the number of independent draws yielding equivalent expected squared discrepancy to the target.
Profile Likelihood Inference for Anisotropic Hyperbolic Wrapped Normal Models on Hyperbolic Space math.ST · 2026-05-01 · unverdicted · none · ref 184 · internal anchor
The profile maximum likelihood estimator for the location in anisotropic hyperbolic wrapped normal models is strongly consistent, asymptotically normal, and attains the Hájek-Le Cam minimax lower bound under squared geodesic loss.
Query-Efficient Quantum Approximate Optimization via Graph-Conditioned Trust Regions cs.LG · 2026-04-27 · unverdicted · none · ref 80 · internal anchor
A GNN predicts Gaussians over QAOA parameters to create graph-conditioned trust regions that reduce circuit evaluations for MaxCut from 85-343 down to 45 while keeping approximation ratios within 3 points of heuristics.
Adaptive Conformal Anomaly Detection with Time Series Foundation Models for Signal Monitoring cs.LG · 2026-04-22 · unverdicted · none · ref 2 · internal anchor
A model-agnostic adaptive conformal anomaly detection approach uses weighted quantile bounds learned from past foundation model predictions to deliver interpretable p-value scores with stable calibration under shifts for time series monitoring.
Causal inference for social network formation econ.EM · 2026-04-20 · conditional · none · ref 102 · internal anchor
Random team assignments in a professional firm reveal that indirect ties strongly increase new direct tie formation, while effects of degree and local density are smaller and less robust.
Answer Only as Precisely as Justified: Calibrated Claim-Level Specificity Control for Agentic Systems cs.CL · 2026-04-19 · unverdicted · none · ref 1 · 2 links · internal anchor
Compositional selective specificity (CSS) decomposes generated answers into claims and emits each at the most specific level supported by evidence, raising overcommitment-aware utility from 0.846 to 0.913 on LongFact while retaining 0.938 specificity.
Diagnosing LLM Judge Reliability: Conformal Prediction Sets and Transitivity Violations cs.AI · 2026-04-16 · unverdicted · none · ref 3 · internal anchor
LLM judges display per-document transitivity violations in 33-67% of cases despite low aggregate rates, while conformal prediction set widths serve as reliable indicators of document-level difficulty with cross-judge agreement.
Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise cs.LG · 2026-04-07 · unverdicted · none · ref 1 · internal anchor
CMRM adds a conformal quantile regularization on prediction margins to any loss, improving noisy-label classification accuracy up to 3.39% across methods and benchmarks while preserving performance at zero noise.
Conformal Risk Control under Non-Monotone Losses: Theory and Finite-Sample Guarantees stat.ML · 2026-04-02 · unverdicted · none · ref 2 · internal anchor
Conformal risk control for bounded non-monotone losses over a grid of size m achieves excess risk of order sqrt(log m / n) with n calibration samples, which is minimax optimal.
Post-Selection Distributional Model Evaluation stat.ML · 2026-03-24 · unverdicted · none · ref 11 · internal anchor
PS-DME is a new framework that controls post-selection false coverage rate for distributional KPI estimates via e-values and is provably more sample-efficient than data splitting under explicit conditions.
From Plausibility to Verifiability: Risk-Controlled Generative OCR with Vision-Language Models cs.CV · 2026-03-20 · unverdicted · none · ref 4 · internal anchor
A model-agnostic Geometric Risk Controller reduces extreme errors in VLM-based OCR by requiring cross-view consensus before accepting outputs.
Safe Planning in Interactive Environments via Iterative Policy Updates and Adversarially Robust Conformal Prediction eess.SY · 2025-11-13 · conditional · none · ref 8 · internal anchor
The work develops an iterative safe planner that adjusts conformal prediction bounds across policy updates via sensitivity analysis to maintain distribution-free safety guarantees despite interaction-induced distribution shifts.
MARGIN: Runtime Confidence Calibration for Multi-Agent Foundation Model Coordination cs.LG · 2026-05-21 · unverdicted · none · ref 1 · internal anchor
MARGIN is an online calibration technique using symmetric EWMA and Bayesian shrinkage that learns per-agent per-band factors from the task stream, cutting calibration error 3-6x versus design-time baselines and lifting multi-agent resolution from 45-56% to 70-89%.
BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation cs.CL · 2026-05-19 · unverdicted · none · ref 44 · internal anchor
BalanceRAG uses sequential graphical testing on a 2D lattice of threshold pairs to certify safe operating points that meet target risk levels in cascaded RAG while increasing coverage.
Conditional Predictive Inference for General Structured Data with Group Symmetries stat.ME · 2026-05-18 · unverdicted · none · ref 71 · internal anchor
C-SymmPI reformulates conditional coverage as miscoverage error over a user-specified function class to deliver near-conditional guarantees under group symmetries and distributional invariance.
Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection cs.AI · 2026-05-13 · unverdicted · none · ref 46 · internal anchor
MSIFR stops faulty LLM generations early via staged rule-based checks, reducing token consumption 11-78% with no accuracy loss.
Multi-Fidelity Quantile Regression stat.ME · 2026-05-11 · unverdicted · none · ref 45 · internal anchor
A model-agnostic two-stage estimator links high-fidelity quantiles to low-fidelity ones via a covariate-dependent level function for faster convergence and better accuracy with limited high-fidelity data.
CONTRA: Conformal Prediction Region via Normalizing Flow Transformation stat.ML · 2026-05-08 · unverdicted · none · ref 11 · internal anchor
CONTRA generates sharp multi-dimensional conformal prediction regions by defining nonconformity scores as distances from the center in the latent space of a normalizing flow.
Scale selection for geometric medians on product manifolds math.ST · 2026-05-08 · unverdicted · none · ref 232 · internal anchor
Joint location-scale minimization for geometric medians on product manifolds degenerates to marginal medians, and three new scale-selection methods restore identifiability with asymptotic guarantees.
Conformal Agent Error Attribution cs.LG · 2026-05-07 · unverdicted · none · ref 1 · internal anchor
A new filtration-based conformal prediction method attributes errors in multi-agent systems by producing contiguous sequence sets with finite-sample coverage guarantees, enabling rollback recovery.
Networked Information Aggregation for Binary Classification cs.LG · 2026-05-01 · unverdicted · none · ref 2 · internal anchor
Sequential prediction passing on DAGs for logistic regression yields O(M/sqrt(D)) excess loss when M-agent windows cover all features, with Omega(k/D) lower bound identifying depth as the fundamental limit.
Unsupervised Confidence Calibration for Reasoning LLMs from a Single Generation cs.LG · 2026-04-21 · unverdicted · none · ref 24 · internal anchor
Unsupervised single-generation confidence calibration for reasoning LLMs via offline self-consistency proxy distillation outperforms baselines on math and QA tasks and improves selective prediction.
DAG-STL: A Hierarchical Framework for Zero-Shot Trajectory Planning under Signal Temporal Logic Specifications cs.RO · 2026-04-20 · unverdicted · none · ref 94 · internal anchor
DAG-STL decomposes long-horizon STL planning into decomposition, timed waypoint allocation, and diffusion-based trajectory generation to enable zero-shot planning under unknown dynamics.
Blind-Spot Mass: A Good-Turing Framework for Quantifying Deployment Coverage Risk in Machine Learning Systems cs.LG · 2026-04-06 · unverdicted · none · ref 1 · internal anchor
Blind-spot mass uses Good-Turing unseen-species estimation to measure the total probability of states with low empirical support, showing that 95% of operational mass lies in blind spots at tau=5 across wearable activity recognition and clinical admission data.
Physics-Guided Tiny-Mamba Transformer for Reliability-Aware Early Fault Warning cs.LG · 2026-01-29 · unverdicted · none · ref 41 · internal anchor
PG-TMT couples a physics-aligned tri-branch encoder with EVT-calibrated decision rules to achieve higher PR-AUC and shorter detection times at controlled false-alarm rates across multiple bearing datasets.
Selective Conformal Risk Control cs.LG · 2025-12-14 · conditional · none · ref 2 · internal anchor
Selective Conformal Risk Control combines selective classification with conformal risk control to produce compact prediction sets that meet target coverage and risk levels.
Self-Supervised Conformal Prediction with Equivariant Bootstrapping for Image Uncertainty Quantification stat.ME · 2026-05-18 · unverdicted · none · ref 3 · internal anchor
A self-supervised conformal prediction method with equivariant bootstrapping enables uncertainty quantification for ill-posed imaging inverse problems such as weak lensing mass mapping without requiring ground truth calibration data.
T-GEMs: Text-Guided Exit Modules for Decreasing CLIP Image Encoder cs.LG · 2026-05-17 · unverdicted · none · ref 29 · internal anchor
Proposes T-GEMs plus a rate-based regularizer for early exits in CLIP encoders guided by text semantics to lower encoder usage costs.
Learning Context-conditioned Gaussian Overbounds for Convolution-Based Uncertainty Propagation cs.LG · 2026-05-15 · unverdicted · none · ref 8 · internal anchor
A learning framework trains neural networks to output context-conditioned Gaussian overbounds with provable conservatism on quantile grids for convolution-based uncertainty propagation.
Adaptive Conformal Prediction for Reliable and Explainable Medical Image Classification cs.CV · 2026-05-13 · unverdicted · none · ref 3 · internal anchor
An adaptive lambda criterion for RAPS achieves 95.72% global coverage and at least 90% coverage across all difficulty strata on medical image datasets while keeping average prediction set size at 1.09.
UCCI: Calibrated Uncertainty for Cost-Optimal LLM Cascade Routing cs.LG · 2026-05-11 · unverdicted · none · ref 25 · internal anchor
UCCI calibrates LLM uncertainty to error probabilities with isotonic regression for cost-optimal cascade routing, delivering 31% cost savings at maintained accuracy on a 75k-query NER task.
Quantile-Free Uncertainty Quantification in Graph Neural Networks cs.LG · 2026-05-06 · unverdicted · none · ref 36 · internal anchor
QpiGNN provides a quantile-free dual-head architecture for GNN uncertainty quantification that directly optimizes coverage and interval width, yielding 22% higher coverage and 50% narrower intervals than baselines on 19 benchmarks with asymptotic coverage guarantees under mild assumptions.
Towards Dependable Retrieval-Augmented Generation Using Factual Confidence Prediction cs.IR · 2026-05-04 · unverdicted · none · ref 2 · internal anchor
A conformal prediction filter for retrieval chunks plus an attention-based factuality classifier can raise RAG answer quality by up to 6% and detect inconsistent generations up to 77% of the time.
An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness cs.AI · 2026-04-27 · unverdicted · none · ref 34 · internal anchor
Updating clinical AI models can cause prediction flips, arbitrariness, and unfair error rates across groups, requiring dedicated monitoring dimensions.
ReconVLA: An Uncertainty-Guided and Failure-Aware Vision-Language-Action Framework for Robotic Control cs.RO · 2026-04-17 · unverdicted · none · ref 30 · internal anchor
ReconVLA enhances pretrained vision-language-action robotic policies with conformal prediction for uncertainty estimation and failure detection without retraining.
Uncertainty-Aware Transformers: Conformal Prediction for Language Models cs.LG · 2026-04-10 · unverdicted · none · ref 1 · internal anchor
CONFIDE applies conformal prediction to transformer embeddings for valid prediction sets, improving accuracy up to 4.09% and efficiency over baselines on models like BERT-tiny.
Probably Approximately Correct (PAC) Guarantees for Data-Driven Reachability Analysis: A Theoretical and Empirical Comparison eess.SY · 2026-04-03 · conditional · none · ref 11 · internal anchor
Formal connections between PAC bounds for three data-driven reachability methods are established, with empirical results showing they are not interchangeable despite similarities.
Neural posterior estimation for scalable and accurate inverse parameter inference in Li-ion batteries physics.data-an · 2026-04-02 · unverdicted · none · ref 44 · internal anchor
NPE delivers millisecond-scale parameter inference for Li-ion batteries that matches or exceeds Bayesian calibration accuracy while adding local sensitivity interpretability, though with higher voltage prediction errors.
AIVV: Neuro-Symbolic LLM Agent-Integrated Verification and Validation for Trustworthy Autonomous Systems cs.AI · 2026-04-02 · unverdicted · none · ref 1 · internal anchor
AIVV deploys LLM agents in a council to semantically validate anomalies in time-series data against natural-language requirements, automating human-in-the-loop verification for autonomous systems.
Uncertainty-Calibrated Explainable Artificial Intelligence for Fetal Ultrasound Plane Classification: A Systematic Review eess.IV · 2026-01-02 · unverdicted · none · ref 1 · internal anchor
PRISMA 2020 systematic review of 78 studies on fetal ultrasound plane classification paired with explainability or uncertainty, introducing the CALIB-XFUS reporting framework across six domains.
