RAVE: A variational autoencoder for fast and high-quality neural audio synthesis
6 Pith papers cite this work.
citing papers explorer
- Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
  Live Music Diffusion Models adapt bidirectional diffusion for interactive music generation via KV caching and ARC-Forcing, recovering and exceeding discrete autoregressive efficiency while enabling post-training alignment without RL.
- Remix the Timbre: Diffusion-Based Style Transfer Across Polyphonic Stems
  MixtureTT performs direct per-stem timbre transfer on polyphonic mixtures via a shared diffusion transformer, outperforming single-stem baselines on SATB choral data while eliminating cascaded separation errors.
- Latent Fourier Transform
  LatentFT uses latent-space Fourier transforms and frequency masking in diffusion autoencoders to enable timescale-specific manipulation of musical structure in generative models.
- Opening the Design Space: Two Years of Performance with Intelligent Musical Instruments
  A portable single-board-computer AI music platform and five case studies demonstrate that remapping inputs, interleaving fast and slow controls, small artist datasets, and cheap hardware can open new artist-centered design spaces for intelligent instruments.
- Drivetrain simulation using variational autoencoders
  Variational autoencoders generate jerk signals from torque inputs in electric drivetrains and outperform physics-based baselines without detailed parametrization.
- Huí Sù: Co-constructing a Dual Feedback Apparatus
  A musical performance co-produces sound through dual feedback loops between a RAVE-based neural audio instrument and a recurrent neural control system, exploring shared agency with human performers.