pith. sign in

Euclid’s gift: En- hancing spatial perception and reasoning in vision-language models via geometric surrogate tasks

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

baseline 1

citation-polarity summary

fields

cs.CV 2 cs.AI 1

years

2026 3

roles

baseline 1

polarities

baseline 1

clear filters

representative citing papers

Why MLLMs Struggle to Determine Object Orientations

cs.CV · 2026-04-14 · accept · novelty 7.0

Orientation information is recoverable from MLLM visual encoder embeddings via linear regression, contradicting the hypothesis that failures originate in the encoders.

RoboPIN: Grounded Embodied Reasoning via Pinned Chain-of-Thought

cs.AI · 2026-06-14 · unverdicted · novelty 6.0

Introduces PinCoT paradigm with visual reasoning anchors, builds PIN-170K dataset via automated pipeline, and trains 4B RoboPIN model via three-stage post-training to outperform 7B baselines by 12% on embodied reasoning benchmarks.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Why MLLMs Struggle to Determine Object Orientations cs.CV · 2026-04-14 · accept · none · ref 16

    Orientation information is recoverable from MLLM visual encoder embeddings via linear regression, contradicting the hypothesis that failures originate in the encoders.

  • RoboPIN: Grounded Embodied Reasoning via Pinned Chain-of-Thought cs.AI · 2026-06-14 · unverdicted · none · ref 9

    Introduces PinCoT paradigm with visual reasoning anchors, builds PIN-170K dataset via automated pipeline, and trains 4B RoboPIN model via three-stage post-training to outperform 7B baselines by 12% on embodied reasoning benchmarks.

  • Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs cs.CV · 2026-05-01 · unverdicted · none · ref 42 · 2 links

    PVM adds a parallel branch to LVLMs that directly supplies visual embeddings to prevent attention decay over long generated sequences, yielding accuracy gains on reasoning tasks with minimal overhead.