Understanding the Effects of Distractors on Reasoning Vision-Language Models
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
How does irrelevant information (i.e., distractors) affect test-time scaling in vision-language models (VLMs)? Prior work on text-only language models has shown that textual distractors can intensify inverse scaling, causing models to produce longer but less effective reasoning traces. In this work, we investigate whether similar phenomena arise in multimodal settings. We introduce Idis (Images with distractors), a visual question-answering dataset that systematically varies distractors along semantic and numerical dimensions. Our analyses reveal that visual distractors affect reasoning VLMs in a fundamentally different way from textual distractors: although inverse scaling still emerges, visual distractors reduce accuracy without increasing reasoning length. We further show that attribute counts extracted from reasoning traces provide key insights into how distractors interact with reasoning length and accuracy. As a sanity check, we propose a simple prompting strategy that mitigates distractor-driven predictions in reasoning vision-language models.
fields
cs.CV 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
Stay Fair! Ensuring Group Fairness in Diffusion Models Across Guidance Scales
StayFair addresses guidance bias in diffusion models by extending demographic parity, allowing fairness to hold across guidance scales via modified classifier or null-embedding steps.