Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.
International Conference on Machine Learning , pages=
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Response entropy in LLMs rises with missing context on SQuAD while sampling-based confidence stays high, supporting the multiple imputation criterion and introducing a diagnostic for uncertainty reduction by context level.
An oracle-optimal decision policy for informative conformal prediction sets is calibrated to ensure finite-sample FCR control and delivers higher power than prior methods on classification tasks.
A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.
citing papers explorer
-
Risk-Controlled Post-Processing of Decision Policies
Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.
-
LLMs as Implicit Imputers: Uncertainty Should Scale with Missing Information
Response entropy in LLMs rises with missing context on SQuAD while sampling-based confidence stays high, supporting the multiple imputation criterion and introducing a diagnostic for uncertainty reduction by context level.
-
Selecting Informative Conformal Prediction Sets with an Optimized FCR-Controlled Approach
An oracle-optimal decision policy for informative conformal prediction sets is calibrated to ensure finite-sample FCR control and delivers higher power than prior methods on classification tasks.
-
Calibrating Model-Based Evaluation Metrics for Summarization
A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.