Are vision language models texture or shape biased and can we steer them?

Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, Bianca Lamm, Muhammad Jehanzeb Mirza, Margret Keuper, Janis Keuper · 2024 · arXiv 2403.09193

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

TraversalBench: Challenging Paths to Follow for Vision Language Models

cs.CV · 2026-04-13 · unverdicted · novelty 7.0

TraversalBench shows self-intersections cause the sharpest performance drops for VLMs on exact path traversal, with errors localized at the first crossing.

AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

cs.CV · 2025-06-10 · unverdicted · novelty 7.0

AVA-Bench evaluates vision foundation models by disentangling 14 atomic visual abilities with aligned training-test distributions to reveal precise ability fingerprints.

citing papers explorer

Showing 2 of 2 citing papers.

TraversalBench: Challenging Paths to Follow for Vision Language Models

cs.CV · 2026-04-13 · unverdicted · none · ref 12

TraversalBench shows self-intersections cause the sharpest performance drops for VLMs on exact path traversal, with errors localized at the first crossing.

AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

cs.CV · 2025-06-10 · unverdicted · none · ref 23

AVA-Bench evaluates vision foundation models by disentangling 14 atomic visual abilities with aligned training-test distributions to reveal precise ability fingerprints.
