arxiv: 1802.06182 · v1 · pith:2ELZ37BMnew · submitted 2018-02-17 · 📡 eess.AS · cs.LG · cs.SD · stat.ML

CREPE: A Convolutional Representation for Pitch Estimation

classification 📡 eess.AS cs.LG cs.SD stat.ML
keywords pitch crepe algorithm convolutional fundamental model performing processing

The task of estimating the fundamental frequency of a monophonic sound recording, also known as pitch tracking, is fundamental to audio processing with multiple applications in speech processing and music information retrieval. To date, the best performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics. While such techniques perform very well on average, there remain many cases in which they fail to correctly estimate the pitch. In this paper, we propose a data-driven pitch tracking algorithm, CREPE, which is based on a deep convolutional neural network that operates directly on the time-domain waveform. We show that the proposed model produces state-of-the-art results, performing equally or better than pYIN. Furthermore, we evaluate the model's generalizability in terms of noise robustness. A pre-trained version of CREPE is made freely available as an open-source Python module for easy application.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Mechanisms of Misgeneralization in Physical Sequence Modeling

    cs.LG · 2026-05 · unverdicted · novelty 6.0

    Generative sequence models for physical tasks exhibit physical misgeneralization where local prediction errors propagate through physical measurements to distort aggregate distributions over quantities like distance o...