pith. sign in

Cohen, Taylor W

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.CV 2

years

2026 1 2025 1

roles

background 1

polarities

background 1

representative citing papers

Do multimodal models imagine electric sheep?

cs.CV · 2026-05-10 · conditional · novelty 6.0

Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.

Grounded Reinforcement Learning for Visual Reasoning

cs.CV · 2025-05-29 · unverdicted · novelty 6.0

ViGoRL introduces visually grounded RL that anchors reasoning steps to image coordinates and uses multi-turn zooming to outperform standard RL and supervised baselines on spatial and GUI reasoning benchmarks.

citing papers explorer

Showing 2 of 2 citing papers.

  • Do multimodal models imagine electric sheep? cs.CV · 2026-05-10 · conditional · none · ref 22

    Fine-tuning VLMs to output action sequences for puzzles causes emergent internal visual representations that improve performance when integrated into reasoning.

  • Grounded Reinforcement Learning for Visual Reasoning cs.CV · 2025-05-29 · unverdicted · none · ref 5

    ViGoRL introduces visually grounded RL that anchors reasoning steps to image coordinates and uses multi-turn zooming to outperform standard RL and supervised baselines on spatial and GUI reasoning benchmarks.