DA3 recovers consistent visual geometry from arbitrary views via a vanilla DINO transformer and depth-ray target, setting new SOTA on a visual geometry benchmark while outperforming DA2 on monocular depth.
Pointodyssey: A large-scale synthetic dataset for long-term point tracking
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
dataset 1
citation-polarity summary
fields
cs.CV 2verdicts
UNVERDICTED 2roles
dataset 1polarities
use dataset 1representative citing papers
ST-Gen4D uses a world model that fuses global appearance and local dynamic graphs into a 4D cognition representation to guide consistent 4D Gaussian generation.
citing papers explorer
-
Depth Anything 3: Recovering the Visual Space from Any Views
DA3 recovers consistent visual geometry from arbitrary views via a vanilla DINO transformer and depth-ray target, setting new SOTA on a visual geometry benchmark while outperforming DA2 on monocular depth.
-
ST-Gen4D: Embedding 4D Spatiotemporal Cognition into World Model for 4D Generation
ST-Gen4D uses a world model that fuses global appearance and local dynamic graphs into a 4D cognition representation to guide consistent 4D Gaussian generation.