Seeing and hearing: Open-domain visual-audio generation with diffusion latent aligners

2 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.SD 2

years

2025 2

representative citing papers

StereoFoley: Object-Aware Stereo Audio Generation from Video

cs.SD · 2025-09-22 · conditional · novelty 7.0

StereoFoley is an end-to-end video-to-stereo-audio framework that uses a base generative model fine-tuned on synthetic object-tracked data with panning and distance controls to achieve object-aware spatial sound.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • StereoFoley: Object-Aware Stereo Audio Generation from Video cs.SD · 2025-09-22 · conditional · none · ref 11
