Title resolution pending

StyleTTS: A style-based generative model for natural, diverse text-to-speech synthesis · 2025 · arXiv 2505.16076

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

MMAE: A Massive Multitask Audio Editing Benchmark

cs.SD · 2026-06-05 · conditional · novelty 8.0

MMAE is a new multitask audio editing benchmark showing that leading models achieve under 5% exact match rate, with 0% on complex mixed-modality tasks.

Hybrid Diffusion Transformer for Instruction-Guided Audio Editing via Rectified Flow

cs.SD · 2026-06-18 · unverdicted · novelty 6.0 · 2 refs

Hybrid two-stage diffusion transformer architecture for instruction-guided audio editing via rectified flow that performs joint attention at low resolution then alternates joint and cross-attention at high resolution for improved performance and efficiency.

Audio Editing in the Era of Foundation Models: A Survey

eess.AS · 2026-06-22 · unverdicted · novelty 3.0

A survey that presents a unified taxonomy of audio editing tasks, summarizes training-based and training-free foundation model approaches, reviews datasets and evaluation protocols, and identifies future challenges.

citing papers explorer

Showing 3 of 3 citing papers.

MMAE: A Massive Multitask Audio Editing Benchmark

cs.SD · 2026-06-05 · conditional · none · ref 22

MMAE is a new multitask audio editing benchmark showing that leading models achieve under 5% exact match rate, with 0% on complex mixed-modality tasks.

Hybrid Diffusion Transformer for Instruction-Guided Audio Editing via Rectified Flow

cs.SD · 2026-06-18 · unverdicted · none · ref 44 · 2 links

Hybrid two-stage diffusion transformer architecture for instruction-guided audio editing via rectified flow that performs joint attention at low resolution then alternates joint and cross-attention at high resolution for improved performance and efficiency.

Audio Editing in the Era of Foundation Models: A Survey

eess.AS · 2026-06-22 · unverdicted · none · ref 9

A survey that presents a unified taxonomy of audio editing tasks, summarizes training-based and training-free foundation model approaches, reviews datasets and evaluation protocols, and identifies future challenges.
