Finite-difference time-domain simulation of low-frequency room acoustic problems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: eess.AS (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Adapting a Text-to-Audio Model for Room Impulse Response Generation
A text-to-audio generative model is adapted for room impulse response generation using vision-language model labeling of image-RIR datasets and in-context learning for free-form prompts.