CoCa: Contrastive captioners are image-text foundation models. Transactions on Machine Learning Research, 2022

Jiahui Yu, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini, Yonghui Wu · 2022

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

cs.CV · 2025-05-19 · unverdicted · novelty 7.0

A contrastive multimodal framework augments satellite-audio datasets with vision-language model sound descriptions to learn shared soundscape concepts for zero-shot retrieval and synthesis.

citing papers explorer

Showing 1 of 1 citing paper.

Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

cs.CV · 2025-05-19 · unverdicted · none · ref 45

A contrastive multimodal framework augments satellite-audio datasets with vision-language model sound descriptions to learn shared soundscape concepts for zero-shot retrieval and synthesis.
