Revisiting multimodal representation in contrastive learning: from patch and token embeddings to finite discrete tokens

Yuxiao Chen, Jianbo Yuan, Yu Tian, Shijie Geng, Xinyu Li, Ding Zhou, Dimitris N Metaxas, Hongxia Yang · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

cs.CV · 2025-05-19 · unverdicted · novelty 7.0

A contrastive multimodal framework augments satellite-audio datasets with vision-language model sound descriptions to learn shared soundscape concepts for zero-shot retrieval and synthesis.

