Proceedings of the AAAI Conference on Artificial Intelligence

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role and citation-polarity summary

years: 2026 (5)

roles: background (1)

polarities: background (1)

representative citing papers

The Minimax Rate of Second-Order Calibration

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

The minimax rate of estimating second-order calibration error is Õ(1/√n) with a matching Ω(1/√n) lower bound, enabled by analyticity from the sech kernel and yielding the first finite-sample guarantee for second-order Platt scaling.

Risk-Controlled Post-Processing of Decision Policies

stat.ML · 2026-05-07 · unverdicted · novelty 7.0

Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.

Reading Calibrated Uncertainty from Language Model Trajectories

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

Geometric features from per-layer MLP update trajectories fed to a sparse linear probe outperform maximum softmax probability for uncertainty quantification under selective abstention, with gains up to 21 AURC points.

citing papers explorer

Showing 5 of 5 citing papers.

  • The Minimax Rate of Second-Order Calibration cs.LG · 2026-05-08 · unverdicted · none · ref 18

    The minimax rate of estimating second-order calibration error is Õ(1/√n) with a matching Ω(1/√n) lower bound, enabled by analyticity from the sech kernel and yielding the first finite-sample guarantee for second-order Platt scaling.

  • Risk-Controlled Post-Processing of Decision Policies stat.ML · 2026-05-07 · unverdicted · none · ref 237

    Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.

  • Reading Calibrated Uncertainty from Language Model Trajectories cs.LG · 2026-05-19 · unverdicted · none · ref 10

    Geometric features from per-layer MLP update trajectories fed to a sparse linear probe outperform maximum softmax probability for uncertainty quantification under selective abstention, with gains up to 21 AURC points.

  • When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering cs.CL · 2026-05-13 · conditional · none · ref 7

    Conflicting biomedical evidence triggers order-dependent prediction flips in RAG LLMs, and a new abstention score combining confidence with conflict detection raises selective accuracy by 7-33 points in the hardest conditions.

  • Calibrating Model-Based Evaluation Metrics for Summarization cs.CL · 2026-04-19 · unverdicted · none · ref 144

    A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.