Individually calibrated predictors become collectively miscalibrated under Brier-optimal strategic responses with positive belief correlations, but VCG aggregation restores dominant-strategy incentive compatibility and near-optimal performance.
Verification of forecasts expressed in terms of probability
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
Inducing artificial uncertainty on trivial tasks allows training probes that achieve higher calibration on hard data than standard approaches while retaining performance on easy data.
Non-affine approval functions create unavoidable miscalibration in proper scoring rules for strategic agents, but step-function thresholds enable first-best screening without it, uniquely for the Brier score.
Probabilistic bias correction doubles AI subseasonal forecast skill and wins a 2025 international competition by correcting biases in ECMWF models for pressure, temperature, and precipitation.
A knowledge-data dual paradigm using geomorphic priors and a tabular foundation model achieves baseline-level landslide susceptibility prediction accuracy with only 30% of typical data in tested regions.
Single-seed CRPS estimates in limited-data BDL show high variance and peaks for heteroscedastic methods, with local variance correlating above 0.96 to single-seed error.
SDM activation and estimator extend softmax with similarity and distance awareness for more robust selective classification in pre-trained language models.
Stacked video ensemble model distinguishes BAV from TAV on PLAX cine loops with outer-CV F1 of 0.907 using Grad-CAM and SHAP for explainability.
citing papers explorer
-
When Individually Calibrated Models Become Collectively Miscalibrated
Individually calibrated predictors become collectively miscalibrated under Brier-optimal strategic responses with positive belief correlations, but VCG aggregation restores dominant-strategy incentive compatibility and near-optimal performance.
-
Inducing Artificial Uncertainty in Language Models
Inducing artificial uncertainty on trivial tasks allows training probes that achieve higher calibration on hard data than standard approaches while retaining performance on easy data.
-
The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting
Non-affine approval functions create unavoidable miscalibration in proper scoring rules for strategic agents, but step-function thresholds enable first-best screening without it, uniquely for the Brier score.
-
Enhancing AI and Dynamical Subseasonal Forecasts with Probabilistic Bias Correction
Probabilistic bias correction doubles AI subseasonal forecast skill and wins a 2025 international competition by correcting biases in ECMWF models for pressure, temperature, and precipitation.
-
Knowledge-Data Dually Driven Paradigm for Accurate Landslide Susceptibility Prediction under Data-Scarce Conditions Using Geomorphic Priors and Tabular Foundation Model
A knowledge-data dual paradigm using geomorphic priors and a tabular foundation model achieves baseline-level landslide susceptibility prediction accuracy with only 30% of typical data in tested regions.
-
A Tale of Two Variances: When Single-Seed Benchmarks Fail in Bayesian Deep Learning
Single-seed CRPS estimates in limited-data BDL show high variance and peaks for heteroscedastic methods, with local variance correlating above 0.96 to single-seed error.
-
Similarity-Distance-Magnitude Activations
SDM activation and estimator extend softmax with similarity and distance awareness for more robust selective classification in pre-trained language models.
-
Robust and Explainable Bicuspid Aortic Valve Diagnosis Using Stacked Ensembles on Echocardiography
Stacked video ensemble model distinguishes BAV from TAV on PLAX cine loops with outer-CV F1 of 0.907 using Grad-CAM and SHAP for explainability.