LENS shapes low-frequency eigen noise with a lightweight network to enable efficient, high-quality sampling in distilled diffusion models.
Ditto: Diffusion inference-time t-optimization for music generation.arXiv preprint arXiv:2401.12179
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2representative citing papers
LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.
Diffusion models improve generation quality via inference-time search over noise candidates guided by verifiers and algorithms, yielding gains beyond denoising step scaling on class- and text-conditioned benchmarks.
citing papers explorer
-
LENS: Low-Frequency Eigen Noise Shaping for Efficient Diffusion Sampling
LENS shapes low-frequency eigen noise with a lightweight network to enable efficient, high-quality sampling in distilled diffusion models.
-
Latent Fourier Transform
LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.
-
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
Diffusion models improve generation quality via inference-time search over noise candidates guided by verifiers and algorithms, yielding gains beyond denoising step scaling on class- and text-conditioned benchmarks.