Video-cot: A comprehensive dataset for spatiotemporal understanding of videos based on chain-of-thought

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CV 1, cs.RO 1

years: 2026 2

verdicts: unverdicted 2

representative citing papers

OneVLA: A Unified Framework for Embodied Tasks

cs.RO · 2026-05-31 · unverdicted · novelty 6.0

OneVLA is a unified VLA model using a shared action head and multi-stage progressive training with CoT fine-tuning that reports state-of-the-art results on both navigation and manipulation in simulation and real-world settings.
