TCSinger: Zero-shot singing voice synthesis with style transfer and multi-level style control

2024 · DOI 10.18653/v1/2024.emnlp-main.117

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MeloDISinger: Melody-Aware & Duration-Preserving Singing Voice Editing with Audio Infilling

eess.AS · 2026-06-29 · unverdicted · novelty 4.0

Proposes MeloDISinger, a flow-matching SVE model with MeloDRP for melody-aware duration-preserving editing and audio infilling, claiming SOTA results.

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

eess.AS · 2026-05-29 · unverdicted · novelty 4.0

SwanSphere introduces a causal autoregressive diffusion transformer architecture with SVAC contrastive learning and ODPO optimization for streaming spatial audio generation from video and text.

citing papers explorer

Showing 2 of 2 citing papers.

MeloDISinger: Melody-Aware & Duration-Preserving Singing Voice Editing with Audio Infilling · eess.AS · 2026-06-29 · unverdicted · none · ref 22
Proposes MeloDISinger, a flow-matching SVE model with MeloDRP for melody-aware duration-preserving editing and audio infilling, claiming SOTA results.
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer · eess.AS · 2026-05-29 · unverdicted · none · ref 56
SwanSphere introduces a causal autoregressive diffusion transformer architecture with SVAC contrastive learning and ODPO optimization for streaming spatial audio generation from video and text.
