MT3: Multi-task multitrack music transcription. International Conference on Learning Representations (ICLR), 2022. https://arxiv.org/abs/2111.03017
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SD (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
Break-the-Beat! renders drum MIDI audio that matches the timbre of a reference clip by fine-tuning a text-to-audio model with a content encoder and hybrid conditioning on a new paired dataset.
STRUM is a multi-stage neural audio-to-chart system that achieves F1 scores of 0.838 (drums), 0.694 (bass), 0.651 (guitar), and 0.539 (vocals) on a 30-song benchmark with released code and models.
citing papers explorer
- ONOTE: Benchmarking Omnimodal Notation Processing for Expert-level Music Intelligence
  ONOTE is a multi-format benchmark that applies a deterministic pipeline to expose a disconnect between perceptual accuracy and music-theoretic comprehension in leading omnimodal AI models.
- Break-the-Beat! Controllable MIDI-to-Drum Audio Synthesis
  Break-the-Beat! renders drum MIDI audio that matches the timbre of a reference clip by fine-tuning a text-to-audio model with a content encoder and hybrid conditioning on a new paired dataset.
- STRUM: A Spectral Transcription and Rhythm Understanding Model for End-to-End Generation of Playable Rhythm-Game Charts
  STRUM is a multi-stage neural audio-to-chart system that achieves F1 scores of 0.838 (drums), 0.694 (bass), 0.651 (guitar), and 0.539 (vocals) on a 30-song benchmark with released code and models.