Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

years: 2026 (10), 2025 (1)

representative citing papers

3D Primitives are a Spatial Language for VLMs

cs.CV · 2026-05-12 · conditional · novelty 7.0

3D geometric primitives in executable code act as an effective intermediate spatial language that boosts VLMs on reconstruction and question-answering tasks.

Self-Improving Small Object Grounding in LVLMs

cs.CV · 2026-06-01 · unverdicted · novelty 6.0

Attention maps in LVLMs enable an IoU regressor (Pearson r > 0.67) and a training-free entropy-based selector that improves small-object localization by up to 19% on COCO and Objects365.
