
Video-to-audio generation with hidden alignment

3 Pith papers cite this work. Polarity classification is still indexing.


fields

cs.CV: 2 · cs.SD: 1

verdicts

unverdicted: 3

representative citing papers

Movie Gen: A Cast of Media Foundation Models

cs.CV · 2024-10-17 · unverdicted · novelty 5.0

A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.

citing papers explorer


  • AudioMoG: Guiding Audio Generation with Mixture-of-Guidance · cs.SD · 2025-09-28 · unverdicted · none · ref 72

    AudioMoG is a mixture-of-guidance sampling technique that combines CFG and AG signals to outperform single-guidance baselines in text-to-audio generation at equivalent speed.

  • Movie Gen: A Cast of Media Foundation Models · cs.CV · 2024-10-17 · unverdicted · none · ref 76

    A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.

  • AMAVA: Adaptive Motion-Aware Video-to-Audio Framework for Visually-Impaired Assistance · cs.CV · 2026-04-26 · unverdicted · none · ref 8

    AMAVA is an adaptive motion-aware video-to-audio framework that switches between scene descriptions and safety sound cues based on detected movement, with a user study showing increased confidence when added to a white cane.