High accuracy in noisy-label learning does not guarantee OOD detection reliability due to uncertainty collapse, and Virtual Margin Regularization offers partial mitigation.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
NeuroState-Bench supplies human-calibrated tasks and probes that measure commitment integrity in LLM agents and shows this measure diverges from ordinary task success.
Citing papers explorer
- When Accuracy Is Not Enough: Uncertainty Collapse between Noisy Label Learning and Out-of-Distribution Detection
High accuracy in noisy-label learning does not guarantee OOD detection reliability due to uncertainty collapse, and Virtual Margin Regularization offers partial mitigation.
- NeuroState-Bench: A Human-Calibrated Benchmark for Commitment Integrity in LLM Agent Profiles
NeuroState-Bench supplies human-calibrated tasks and probes that measure commitment integrity in LLM agents and shows this measure diverges from ordinary task success.