UBD leverages ensemble uncertainty to estimate per-sample memorization and construct debiased targets for post-hoc correction or unlearning, yielding output distributions closer to those of uncontaminated models on MMLU-Pro and MATH-MCQA than baseline methods do.
arXiv preprint arXiv:2507.10016
2 Pith papers cite this work. Polarity classification is still indexing.
Afrispeech Semantics: Evaluating Audio Semantic Reasoning in Spoken Language Models Across Domains and Accents
Audio language models are benchmarked on five semantic and paralinguistic reasoning tasks to reveal limitations in handling spoken audio evidence, accent variation, and domain shifts.