OmniMAE: Single Model Masked Pretraining on Images and Videos

Rohit Girdhar, Alaaeldin El-Nouby, Mannat Singh, Kalyan Vasudev Alwala, Armand Joulin, Ishan Misra · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

Representative citing papers

cs.CV · 2025-12-19 · conditional · novelty 7.0

MMLandmarks supplies 197k aerial and 329k ground images plus text and GPS for 18,557 landmarks to benchmark multimodal geo-spatial understanding.

MMLANDMARKS: a Cross-View Instance-Level Benchmark for Geo-Spatial Understanding · cs.CV · 2025-12-19 · conditional · none · ref 24