
arxiv: 1804.09202 · v1 · pith:7STCMBAXnew · submitted 2018-04-24 · 💻 cs.SD · eess.AS

Vocal melody extraction using patch-based CNN

classification 💻 cs.SD eess.AS
keywords: model, patch-based, data, extraction, melody, representation, vocal, accuracy

This paper presents a patch-based convolutional neural network (CNN) model for vocal melody extraction in polyphonic music, inspired by object detection in image processing. The input to the model is a novel time-frequency representation that enhances the pitch contours and suppresses the harmonic components of a signal. This succinct data representation and the patch-based CNN model enable efficient training with limited labeled data. Experiments on various datasets show excellent speed and competitive accuracy compared to other deep learning approaches.

