Video-to-audio generation with hidden alignment
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.
AMAVA is an adaptive motion-aware video-to-audio framework that switches between scene descriptions and safety sound cues based on detected movement, with a user study showing increased confidence when added to a white cane.
citing papers explorer
- AudioMoG: Guiding Audio Generation with Mixture-of-Guidance
  AudioMoG is a mixture-of-guidance sampling technique that combines CFG and AG signals to outperform single-guidance baselines in text-to-audio generation at equivalent speed.
- Movie Gen: A Cast of Media Foundation Models
  A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.
- AMAVA: Adaptive Motion-Aware Video-to-Audio Framework for Visually-Impaired Assistance
  AMAVA is an adaptive motion-aware video-to-audio framework that switches between scene descriptions and safety sound cues based on detected movement, with a user study showing increased confidence when added to a white cane.