Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
2 Pith papers cite this work.
abstract
We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to improve the accuracy of polyphonic transcription.
years
2019 (2)
verdicts
UNVERDICTED (2)
representative citing papers
MIDI-Sandwich is a hierarchical VAE-GAN architecture that generates structured 136-beat melodies by modeling local bars and global relationships on the Nottingham dataset.
citing papers explorer
- R-Transformer: Recurrent Neural Network Enhanced Transformer
  R-Transformer integrates RNNs with multi-head attention to model local and global sequence dependencies without position embeddings and reports large-margin gains over prior methods on diverse tasks.
- MIDI-Sandwich: Multi-model Multi-task Hierarchical Conditional VAE-GAN networks for Symbolic Single-track Music Generation
  MIDI-Sandwich is a hierarchical VAE-GAN architecture that generates structured 136-beat melodies by modeling local bars and global relationships on the Nottingham dataset.