Sheet Music Benchmark: Standardized Optical Music Recognition Evaluation

Craig Sapp; David Rizo; Greg Chapman; Joan Cerveto-Serrano; Jorge Calvo-Zaragoza; Juan C. Martinez-Sevilla; Noelia Luna

arxiv: 2506.10488 · v3 · pith:SCE4FXXQnew · submitted 2025-06-12 · 💻 cs.CV · cs.DL · cs.IR

Juan C. Martinez-Sevilla, Joan Cerveto-Serrano, Noelia Luna, Greg Chapman, Craig Sapp, David Rizo, Jorge Calvo-Zaragoza

classification 💻 cs.CV cs.DL cs.IR

keywords music benchmark omr-ned dataset error evaluation introduce musical

In this work, we introduce the Sheet Music Benchmark (SMB), a dataset of 685 pages specifically designed to benchmark Optical Music Recognition (OMR) research. SMB encompasses a diverse array of musical textures, including monophony, pianoform, quartet, and others, all encoded in Common Western Modern Notation using the Humdrum **kern format. Alongside SMB, we introduce the OMR Normalized Edit Distance (OMR-NED), a new metric tailored explicitly for evaluating OMR performance. OMR-NED builds upon the widely used Symbol Error Rate (SER), offering a fine-grained error analysis that covers individual musical elements such as note heads, beams, pitches, accidentals, and other critical notation features. The single numeric score provided by OMR-NED allows direct comparison between systems, enabling researchers and end users alike to identify the OMR approach best suited to their needs. Our work thus addresses a long-standing gap in OMR evaluation, and we support our contributions with baseline experiments using standardized SMB dataset splits for training and assessing state-of-the-art methods.
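
For readers unfamiliar with the Symbol Error Rate that OMR-NED builds upon, the following is a minimal sketch of SER in its common form: the token-level edit distance between a predicted sequence and a reference, divided by the reference length. It is not the OMR-NED metric itself, whose per-element breakdown is defined in the paper. The function names and the **kern-like example tokens are illustrative assumptions, not taken from the SMB tooling.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def symbol_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length (common SER definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical token sequences: one substitution and one deletion.
reference = ["4c", "4d", "4e", "4f"]
prediction = ["4c", "4d#", "4f"]
print(symbol_error_rate(reference, prediction))  # 0.5
```

Because SER treats every token as an indivisible unit, a wrong accidental and a wholly wrong note cost the same here, which is the kind of coarseness a finer-grained metric is meant to address.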

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Rubato: Transcribing Piano Music with Timestamps
cs.SD 2026-05 unverdicted novelty 6.0

Rubato model with InterMo representation outperforms cascade methods in generating timestamped piano sheet music from audio, even when cascades receive ground-truth MIDI.

A High-Accuracy Optical Music Recognition Method Based on Bottleneck Residual Convolutions
cs.CV 2026-04 unverdicted novelty 3.0

A CNN using ResNet-v2-style residual bottleneck blocks and multi-scale dilated convolutions followed by BiGRU and CTC loss achieves SeER of 7.52% and SyER of 0.45% on the Camera-PrIMuS dataset for optical music recognition.