A Manual Bar-by-Bar Tempo Measurement Protocol for Polyphonic Chamber Music Recordings: Design, Validation, and Application to Beethoven's Piano and Cello Sonatas
Pith reviewed 2026-05-10 09:15 UTC · model grok-4.3
The pith
A cumulative lap-timer protocol produces bar-by-bar beats-per-minute data with millisecond resolution from polyphonic historical recordings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The protocol yields bar-level beats-per-minute data with millisecond resolution, permits internal self-validation, and captures expressive timing phenomena (rubato, fermatas, accelerandi, ritardandi) that automated tools systematically suppress or misread. It rests on a cumulative timestamp architecture developed in collaboration with an engineer, includes a derived BPM formula and spreadsheet structure, and was used to process duo recordings of Beethoven's piano and cello sonatas from 1930 to 2012.
What carries the argument
The cumulative timestamp architecture that records running totals to each bar boundary, preventing error accumulation during manual annotation of polyphonic audio.
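To make the idea concrete, here is a minimal Python sketch of how cumulative bar-boundary timestamps might be turned into per-bar BPM. It is not the paper's spreadsheet: the function name `bar_bpm`, the `beats_per_bar` parameter, and the timestamps are all hypothetical, and the scaling by beats per bar is an assumption about how a bar duration maps to beats per minute.

```python
from typing import List


def bar_bpm(cumulative_s: List[float], beats_per_bar: int) -> List[float]:
    """Convert cumulative bar-boundary timestamps (seconds) to per-bar BPM.

    cumulative_s[0] is the onset of the first bar; every later entry is the
    running total at the next bar boundary (hypothetical layout).
    """
    bpm = []
    for start, end in zip(cumulative_s, cumulative_s[1:]):
        duration = end - start  # length of this bar in seconds
        if duration <= 0:
            raise ValueError("timestamps must be strictly increasing")
        bpm.append(60.0 * beats_per_bar / duration)
    return bpm


# Hypothetical timestamps for four bars in 4/4, each lasting 2 s.
laps = [0.0, 2.0, 4.0, 6.0, 8.0]
clean = bar_bpm(laps, beats_per_bar=4)
# clean -> [120.0, 120.0, 120.0, 120.0]

# Mis-tap the second boundary by +0.5 s: only the two adjacent bars change,
# and the total elapsed time (last minus first timestamp) is unaffected.
mistap = [0.0, 2.5, 4.0, 6.0, 8.0]
noisy = bar_bpm(mistap, beats_per_bar=4)
# noisy -> [96.0, 160.0, 120.0, 120.0]
```

Because each boundary is stored as a running total rather than as a separately timed interval, a single misplaced tap shows up as a slow bar followed by a fast bar (or the reverse) and leaves later bars untouched.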
If this is right
- The mathematical derivation of the BPM formula and the spreadsheet data structure enable direct replication on additional recordings.
- The generated dataset supports quantitative comparisons of tempo across performers and eras through tempographs and ridgeline plots.
- The protocol records expressive features such as accelerandi and ritardandi that would otherwise be lost in automated analysis of chamber music.
- Public release of the dataset and analysis code allows extension to other movements or similar polyphonic works.
Where Pith is reading between the lines
- The same cumulative timestamp approach could be tested on other historical polyphonic genres, such as string quartets, to check whether it remains effective beyond duo sonatas.
- Running the protocol on modern studio recordings of the same works would allow direct comparison of expressive timing differences between eras.
- Adding a second annotation layer for dynamics or articulation at the same bar points might produce richer performance datasets with minimal extra effort.
- Statistical measures of agreement across multiple annotators could be added to quantify the reliability of the bar-boundary identifications.
Load-bearing premise
Human annotators can consistently and accurately identify bar boundaries in polyphonic historical audio without introducing systematic bias or fatigue-related errors that the cumulative timestamp architecture cannot fully mitigate.
What would settle it
Independent runs of the protocol by two different annotators on the same recordings, followed by direct comparison of the resulting per-bar BPM sequences, would show whether timings and values match within acceptable error bounds.
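A comparison of this kind could be sketched as follows. This is an illustrative Python example only: the function `compare_bpm`, the choice of summary statistics, and the BPM values are hypothetical, and what counts as an acceptable error bound is left open, as it is in the text above.

```python
from statistics import mean, pstdev
from typing import Dict, List


def compare_bpm(a: List[float], b: List[float]) -> Dict[str, float]:
    """Summarise per-bar BPM disagreement between two annotators."""
    if len(a) != len(b):
        raise ValueError("both annotators must cover the same bars")
    diffs = [x - y for x, y in zip(a, b)]  # signed difference per bar
    abs_diffs = [abs(d) for d in diffs]
    return {
        "mean_abs_diff": mean(abs_diffs),   # typical size of disagreement
        "mean_signed_diff": mean(diffs),    # non-zero suggests systematic bias
        "sd_diff": pstdev(diffs),           # spread of the disagreement
        "max_abs_diff": max(abs_diffs),     # worst single bar
    }


# Hypothetical per-bar BPM sequences from two annotators.
annotator_a = [120.0, 118.0, 96.0, 124.0]
annotator_b = [119.0, 120.0, 97.0, 124.0]
summary = compare_bpm(annotator_a, annotator_b)
```

A mean signed difference close to zero alongside a small mean absolute difference would indicate agreement without systematic bias; a large worst-bar value would point to specific boundaries worth re-examining.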
Original abstract
Empirical performance analysis depends on the accurate extraction of tempo data from recordings, yet standard computational tools, designed for monophonic audio or modern studio conditions, fail systematically when applied to historical polyphonic chamber music. This paper documents the failure of automated beat-detection software on duo recordings of Beethoven's five piano and cello sonatas (Op. 5 Nos. 1 and 2; Op. 69; Op. 102 Nos. 1 and 2), and presents a formalised manual alternative: a cumulative lap-timer protocol that yields bar-level beats-per-minute data with millisecond resolution. The protocol, developed in cross-disciplinary collaboration with an engineer specialising in VLSI design, rests on a cumulative timestamp architecture that prevents error accumulation, permits internal self-validation, and captures expressive timing phenomena (rubato, fermatas, accelerandi, ritardandi) that automated tools systematically suppress or misread. The mathematical derivation of the BPM formula, the spreadsheet data structure, and the error characterisation are presented in full. Applied to over one hundred movement-level recordings spanning 1930–2012, the protocol generated a dataset subsequently visualised through tempographs, histograms with spline-smoothed probability density functions, ridgeline plots, and combination charts. The paper argues that manual annotation is not a methodological retreat but a principled response to the intrinsic limitations of computational tools when faced with the specific challenges of polyphonic historical recordings. The complete dataset and analysis code are publicly available.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a formal manual protocol for bar-by-bar tempo measurement in polyphonic historical chamber music recordings, using a cumulative timestamp (lap-timer) architecture to achieve millisecond-resolution BPM data. It documents the systematic failure of automated beat-detection tools on Beethoven piano-cello sonata recordings, derives the BPM formula mathematically, characterises errors, applies the method to over 100 movement-level recordings (1930–2012), generates visualisations (tempographs, ridgeline plots, etc.), and releases the full dataset and analysis code publicly. The central claim is that this manual approach is a principled solution for capturing expressive timing (rubato, fermatas, accelerandi) that automation suppresses.
Significance. If the protocol's claimed reliability holds, the work supplies a reproducible, high-resolution method for extracting expressive tempo data from polyphonic historical audio where current computational tools are inadequate, directly supporting empirical performance analysis. The public release of the complete dataset and code is a clear strength that enables independent verification and reuse.
major comments (1)
- [Abstract and error characterisation section] The paper asserts that the protocol 'permits internal self-validation' and presents 'the error characterisation ... in full,' yet no quantitative metrics are reported (e.g., inter-annotator agreement coefficients, repeated-annotation error rates, or comparison of annotated bar onsets against score-derived positions). This is load-bearing for the central claim, because the protocol's advantage over automated tools rests on demonstrating that human boundary placement is sufficiently consistent to avoid systematic bias or fatigue effects that would propagate into the BPM curves and tempographs.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. We address the single major comment below and outline the revisions we will make.
- Referee: [Abstract and error characterisation section] The paper asserts that the protocol 'permits internal self-validation' and presents 'the error characterisation ... in full,' yet no quantitative metrics are reported (e.g., inter-annotator agreement coefficients, repeated-annotation error rates, or comparison of annotated bar onsets against score-derived positions). This is load-bearing for the central claim, because the protocol's advantage over automated tools rests on demonstrating that human boundary placement is sufficiently consistent to avoid systematic bias or fatigue effects that would propagate into the BPM curves and tempographs.
Authors: We agree that the absence of quantitative metrics is a gap. The internal self-validation arises from the cumulative timestamp (lap-timer) architecture, which makes any boundary inconsistency immediately visible in subsequent bar timings and prevents cumulative drift; this is described in the protocol and error characterisation sections. The error characterisation section details qualitative sources of error (onset ambiguity in polyphonic textures, potential fatigue) and the design mitigations. However, no numerical coefficients or repeated-annotation statistics are provided. We will revise by re-annotating a stratified subset of at least 20 movements after a minimum one-month delay, computing bar-onset timing differences and derived BPM discrepancies, and reporting mean absolute error, standard deviation, and any systematic trends. We will also add a brief discussion of why direct comparison to score-derived onsets is limited for expressive historical performances. These quantitative results and the discussion will be added to the error characterisation section and referenced in the abstract. This revision directly addresses the load-bearing concern while preserving the protocol's core description.
Revision: yes
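One way the proposed re-annotation check for systematic trends might look is sketched below in Python. Everything here is an assumption for illustration: the function name `onset_drift`, the use of an ordinary least-squares slope against bar index as the trend measure, and the deliberately exaggerated onset values.

```python
from statistics import mean
from typing import List, Tuple


def onset_drift(first_pass: List[float],
                second_pass: List[float]) -> Tuple[float, float]:
    """Return (mean absolute onset difference in s, drift slope in s per bar).

    The slope is an ordinary least-squares fit of the signed onset
    difference against bar index; a slope near zero suggests no
    progressive (e.g. fatigue-related) trend between the two passes.
    """
    if len(first_pass) != len(second_pass) or len(first_pass) < 2:
        raise ValueError("need two equally long passes of at least two bars")
    diffs = [b - a for a, b in zip(first_pass, second_pass)]
    idx = list(range(len(diffs)))
    x_bar, d_bar = mean(idx), mean(diffs)
    sxx = sum((x - x_bar) ** 2 for x in idx)
    sxy = sum((x - x_bar) * (d - d_bar) for x, d in zip(idx, diffs))
    mae = mean([abs(d) for d in diffs])
    return mae, sxy / sxx


# Hypothetical bar onsets (seconds) from two passes by the same annotator;
# the second pass lags progressively, exaggerated here for clarity.
first = [0.0, 2.0, 4.0, 6.0]
second = [0.0, 2.25, 4.5, 6.75]
mae, slope = onset_drift(first, second)
```

Reporting the slope alongside the mean absolute error separates random placement noise from a lag that grows over the course of a movement.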
Circularity Check
No circularity: manual protocol and BPM derivation are independent of outputs
The paper describes a cumulative lap-timer protocol for manual bar-boundary annotation in polyphonic recordings, with a standard BPM formula derived directly from measured timestamps (BPM = 60 / inter-bar interval in seconds, aggregated per bar). This is a direct calculation from human-placed timestamps rather than any fitted parameter, self-referential definition, or prediction that reduces to the input data by construction. No self-citations are invoked as load-bearing uniqueness theorems, no ansatz is smuggled, and no known empirical pattern is merely renamed. Internal self-validation refers to consistency checks on the cumulative timestamps themselves, not a redefinition of the measured quantities. The protocol stands as an independent measurement method whose outputs (BPM curves) are not presupposed in its design.
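The calculation described in this rationale can be written out as a short worked equation. The notation is an assumption made for illustration and is not taken from the paper: $T_k$ is the cumulative timestamp in seconds at the end of bar $k$, $T_0$ is the onset of the first bar, $N$ is the number of bars, and $b$ is the number of notated beats per bar (with $b = 1$ the expression reduces to 60 divided by the inter-bar interval, as stated in the rationale).

```latex
\begin{align}
\Delta t_k &= T_k - T_{k-1}, \qquad k = 1, \dots, N, \\
\mathrm{BPM}_k &= \frac{60\, b}{\Delta t_k}, \\
\sum_{k=1}^{N} \Delta t_k &= T_N - T_0 .
\end{align}
```

For example, a 4/4 bar ($b = 4$) lasting $\Delta t_k = 2$ s gives $\mathrm{BPM}_k = 120$. The telescoping sum in the last line is one way to read the consistency check: the bar durations must add up to the total elapsed time, and an error in a single timestamp $T_k$ alters only $\Delta t_k$ and $\Delta t_{k+1}$.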
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: human listeners can accurately and consistently detect bar boundaries in polyphonic historical recordings.
Forward citations
Cited by 4 Pith papers
- Towards Revised Tempo Indications for Beethoven's Piano and Cello Sonatas: Czerny, Moscheles, Kolisch, and Recorded Practice 1930-2012
  Historical tempo indications for Beethoven sonatas are exceeded by 15-39% in recordings, Kolisch's markings align better, and stable mid-range traditions support proposed revised tempo ranges reflecting actual perform...
- Spectrographic Portamento Gradient Analysis: A Quantitative Method for Historical Cello Recordings with Application to Beethoven's Piano and Cello Sonatas, 1930–2012
  A spectrographic method quantifies portamento gradient in Hz/s and finds it correlates negatively with tempo across 22 historical recordings of Beethoven cello sonatas from 1930-2012.
- Coexisting Tempo Traditions in Beethoven's Piano and Cello Sonatas: A K-means Clustering Analysis of Recorded Performances, 1930-2012
  K-means clustering of historical Beethoven sonata recordings identifies multiple stable tempo traditions per movement that persist independently over eight decades instead of showing uniform change.
- A Complementary Visualisation Suite for Empirical Performance Analysis: Tempographs, Histograms, Ridgeline Plots, Stacked Bar Charts, and Combination Charts Applied to Beethoven's Piano and Cello Sonatas
  The paper introduces and demonstrates a complementary set of five visualizations including tempographs, spline-smoothed histograms, ridgeline plots, stacked bar charts, and combination charts for bar-level tempo data ...