Cross4D-JEPA uses dense projection-based cross-modal correspondence to distill features from DINOv2 or V-JEPA 2 into a 4D point encoder, outperforming intra-modal and global cross-modal baselines on four benchmarks while improving label efficiency.
Self-supervised JEPA-based world models for LiDAR occupancy completion and forecasting. Computing Research Repository, arXiv Preprints, arXiv:2602.12540, pages 1–9, 2026
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: unverdicted (2)
Representative citing papers
TERRA formalizes cross-domain transfer for action-conditioned latent predictors via controlled Markov processes and bisimulation metrics, states a falsifiable Structured-State Transfer Hypothesis, and outlines a preregistered experimental program with no empirical results presented.
citing papers explorer
- Cross4D-JEPA: Dense Cross-modal Correspondence Distillation for 4D Point Cloud Representation Learning