Advances in neural information processing systems

Selective classification for deep neural networks

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 2

citation-polarity summary

use method 2

representative citing papers

Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift

cs.LG · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

SeqRejectron constructs a stopping rule with a small set of validator policies to achieve horizon-free sample complexity for selective imitation learning under arbitrary dynamics shifts.

A Regime Theory of Controller Class Selection for LLM Action Decisions

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

A regime theory selects the optimal controller class for LLM action decisions from a nested lattice of four classes using three data-estimable bottlenecks, with a Bernstein-tight threshold and empirical matches on multiple benchmarks.

Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging

cs.SE · 2026-04-20 · unverdicted · novelty 7.0

Structural dependency graphs and staged pre-execution verification raise LLM-based EDA code pass rates to 82.5% (single-step) and 70-84% (multi-step) while halving tool calls by catching dependency violations before runtime.

Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference

stat.ME · 2026-05-20 · unverdicted · novelty 6.0

Derives simultaneous finite-sample distribution-free upper bounds on false discovery proportions for conformal p-values that hold for every possible rejection threshold.

BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

BalanceRAG uses sequential graphical testing on a 2D lattice of threshold pairs to certify safe operating points that meet target risk levels in cascaded RAG while increasing coverage.

Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

cs.CL · 2026-05-19 · unverdicted · novelty 6.0

SCA framework applies Information Bottleneck to assign step-level confidence in black-box LLM reasoning traces, flagging errors and boosting self-correction success by up to 13.5% on math and QA tasks.

When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering

cs.CL · 2026-05-13 · conditional · novelty 6.0

Conflicting biomedical evidence triggers order-dependent prediction flips in RAG LLMs, and a new abstention score combining confidence with conflict detection raises selective accuracy by 7-33 points in the hardest conditions.

Post-hoc Selective Classification for Reliable Synthetic Image Detection

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

ReSIDe generalizes logit-based confidence scores to intermediate layers of synthetic image detectors and uses preference optimization to aggregate them, cutting area under the risk-coverage curve by up to 69.55% under covariate shifts.

CHASE: Competing Hypotheses for Ambiguity-Aware Selective Prediction

cs.CV · 2026-05-02 · unverdicted · novelty 6.0

CHASE improves selective prediction under ambiguity by optimizing a ranking-aware selector over margins between competing temporal hypotheses, yielding up to 11% better alignment and 8.8% higher three-way accuracy than baselines on GUV-inspired tasks.

Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering

cs.CL · 2026-05-19 · unverdicted · novelty 5.0

Mainstream UQ for LLMs reduces to unsupervised clustering of internal generation consistency and therefore cannot detect confident hallucinations or provide reliable safety signals.

citing papers explorer

Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift
cs.LG · 2026-05-09 · unverdicted · none · ref 12 · 2 links

A Regime Theory of Controller Class Selection for LLM Action Decisions
cs.AI · 2026-05-07 · unverdicted · none · ref 3

Structural Verification for Reliable EDA Code Generation without Tool-in-the-Loop Debugging
cs.SE · 2026-04-20 · unverdicted · none · ref 30

Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference
stat.ME · 2026-05-20 · unverdicted · none · ref 8

BalanceRAG: Joint Risk Calibration for Cascaded Retrieval-Augmented Generation
cs.CL · 2026-05-19 · unverdicted · none · ref 6

Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution
cs.CL · 2026-05-19 · unverdicted · none · ref 56

When Evidence Conflicts: Uncertainty and Order Effects in Retrieval-Augmented Biomedical Question Answering
cs.CL · 2026-05-13 · conditional · none · ref 8

Post-hoc Selective Classification for Reliable Synthetic Image Detection
cs.CV · 2026-05-09 · unverdicted · none · ref 11

CHASE: Competing Hypotheses for Ambiguity-Aware Selective Prediction
cs.CV · 2026-05-02 · unverdicted · none · ref 1

Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering
cs.CL · 2026-05-19 · unverdicted · none · ref 32
