Causal discovery in heterogeneous environments under the sparse mechanism shift hypothesis
2 Pith papers cite this work.
Years: 2026 (2)
Representative citing papers
ECR-Net is a framework that discovers and adapts causal structures in non-stationary data by evolving gene-regulatory-network-like graphs via fitness-optimized search.
citing papers explorer
4DVLT: Dynamic Scene Understanding with Worldline-Centered Vision-Language Tracking
The paper defines the 4DVLT task for worldline-centered 4D scene understanding, releases Instruct-4D with 129.4K QA pairs, and presents 4DTrack achieving 62.68 TGA_Top1, outperforming adapted baselines by 19.62 points.