SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GeoViSTA: Geospatial Vision-Tabular Transformer for Multimodal Environment Representation
GeoViSTA learns unified geospatial embeddings from co-registered imagery and tabular data via bilateral cross-attention and joint masked autoencoding, yielding better linear probing performance on mortality and fire hazard prediction tasks.