CounterCount shows VLMs perform well on factual counting images but degrade on counterfactual edits, revealing reliance on object priors, and introduces an attention reweighting method that improves accuracy by up to 8%.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CV 2years
2026 2roles
background 1polarities
background 1representative citing papers
citing papers explorer
-
CounterCount: A Diagnostic Framework for Counting Bias in Vision Language Models
CounterCount shows VLMs perform well on factual counting images but degrade on counterfactual edits, revealing reliance on object priors, and introduces an attention reweighting method that improves accuracy by up to 8%.
- The Cartesian Shortcut: Re-evaluate Vision Reasoning in Polar Coordinate Space