arXiv preprint arXiv:2601.09499 (2025)

Sucar, E · 2026 · arXiv 2601.09499

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking

cs.CV · 2026-05-12 · unverdicted · novelty 8.0

TrackCraft3R is the first method to repurpose a video diffusion transformer as a feed-forward dense 3D tracker via dual-latent representations and temporal RoPE alignment, achieving SOTA performance with lower compute.

Face Anything: 4D Face Reconstruction from Any Image Sequence

cs.CV · 2026-04-21 · unverdicted · novelty 7.0

A single transformer model jointly predicts depth and normalized canonical coordinates to deliver state-of-the-art 4D facial geometry and tracking with 3x lower correspondence error and 16% better depth accuracy.

LongDPM: Overlap-Aware 4D Reconstruction from Long Monocular Videos

cs.CV · 2026-05-17 · unverdicted · novelty 6.0

LongDPM introduces an overlap-aware chunk-based framework that registers and fuses local dynamic reconstructions to achieve coherent long-range 4D geometry and tracking from monocular video.

citing papers explorer

Showing 3 of 3 citing papers.

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking · cs.CV · 2026-05-12 · unverdicted · none · ref 65
Face Anything: 4D Face Reconstruction from Any Image Sequence · cs.CV · 2026-04-21 · unverdicted · none · ref 58
LongDPM: Overlap-Aware 4D Reconstruction from Long Monocular Videos · cs.CV · 2026-05-17 · unverdicted · none · ref 28
