Routing summaries and auxiliary features do not provide stable evidence of conditional miscalibration in AR transformers once confidence-matched baselines, capacity controls, and permutation nulls are applied.
OT2D” entry was a duplicate Nadaraya–Watson estimator on(c,PredEntropy) with Scott’s-rule bandwidth and the “optimal-transport
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Probing Routing-Conditional Calibration in Attention-Residual Transformers
Routing summaries and auxiliary features do not provide stable evidence of conditional miscalibration in AR transformers once confidence-matched baselines, capacity controls, and permutation nulls are applied.