Omni-Embed-Audio uses multimodal LLMs to match CLAP on standard audio retrieval while improving text-to-text retrieval by 22% relative and hard negative discrimination by 4.3 points HNSR@10 on user-intent queries.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)

Representative citing papers:
Omni-Embed-Audio: Leveraging Multimodal LLMs for Robust Audio-Text Retrieval