SQI uses axiomatic constraints, hierarchical decomposition, and counterfactual verification to align linguistic reasoning with visual perception in frozen VLMs, achieving second place on the DataCV 2026 illusion challenge.
React: Synergizing reasoning and acting in language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Beyond Shortcuts: Mitigating Visual Illusions in Frozen VLMs via Qualitative Reasoning
SQI uses axiomatic constraints, hierarchical decomposition, and counterfactual verification to align linguistic reasoning with visual perception in frozen VLMs, achieving second place on the DataCV 2026 illusion challenge.