FLUX that Plays Music

Zhengcong Fei et al. · 2024 · arXiv 2409.00587

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline

cs.SD · 2026-02-24 · unverdicted · novelty 7.0

MIDI-SAG generates consistent long-form singing accompaniments by feeding symbolic MIDI timing, chords, and structure labels into a compositional pipeline built from pre-trained modules.

SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering

cs.SD · 2025-08-05 · unverdicted · novelty 6.0

SonicMaster is a text-conditioned flow-matching generative model for unified music restoration and mastering, trained on a dataset of simulated degradations across equalization, dynamics, reverb, amplitude, and stereo.

Academic Text-to-Music Grand Challenge: Datasets, Baselines, and Evaluation Methods

cs.SD · 2026-05-20 · accept · novelty 5.0

The paper introduces the ATTM Grand Challenge with a CC-licensed instrumental subset of MTG-Jamendo, two tracks, and evaluation via FAD, CLAP, and a new Concept Coverage Score to support academic text-to-music research.

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

eess.AS · 2024-10-09 · unverdicted · novelty 5.0

F5-TTS generates natural speech from text via flow matching on DiT with simple text padding, ConvNeXt refinement, and sway sampling, trained on 100K hours of multilingual data.

citing papers explorer


MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline

cs.SD · 2026-02-24 · unverdicted · none · ref 10

SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering

cs.SD · 2025-08-05 · unverdicted · none · ref 4

Academic Text-to-Music Grand Challenge: Datasets, Baselines, and Evaluation Methods

cs.SD · 2026-05-20 · accept · none · ref 6

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

eess.AS · 2024-10-09 · unverdicted · none · ref 94
