arxiv: 2605.03541 · v1 · submitted 2026-05-05 · 💻 cs.SD · cs.IR


Cosmodoit: A Python Package for Adaptive, Efficient Pipelining of Feature Extraction from Performed Music

Corentin Guichaoua, Daniel Bedoya, Elaine Chew


Pith reviewed 2026-05-07 12:48 UTC · model grok-4.3


keywords: music performance analysis, feature extraction, performance-to-score alignment, Python package, modular pipeline, music information retrieval, symbolic features, audio features


The pith

Cosmodoit is a Python package that integrates performance-to-score alignment with symbolic and audio feature extraction in one modular pipeline.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Cosmodoit to solve the problem of tools for music performance analysis being scattered across languages and formats, which makes combining them inefficient. It establishes that a single extensible pipeline can handle alignment plus feature extraction while supporting selective processing, dependency tracking, and incremental updates. This matters because it cuts repeated computation, lowers integration errors, and scales analysis to large collections of performances. The design accommodates existing algorithms from different sources and permits parameter adjustments for consistent outputs.

Core claim

Cosmodoit integrates performance-to-score alignment with symbolic and audio feature extraction in a modular, flexible pipeline that supports selective processing, dependency-aware computation, and incremental updates. Its extensible design reduces duplicated work, minimizes errors, and enables efficient large-scale processing. By accommodating algorithms implemented in multiple languages and allowing parameter tuning for consistent feature extraction, Cosmodoit provides a versatile and practical tool for both research and development in music performance analysis.

What carries the argument

Cosmodoit, the Python package whose modular architecture integrates alignment and extraction tasks while managing dependencies and incremental recomputation.
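Incremental recomputation can be pictured with a minimal sketch: a task reruns only when the content of one of its inputs has changed since the last recorded run. The paper summary does not say how Cosmodoit decides that a result is stale, so the content-hash check below is an assumption, and `run_if_stale`, `count_notes`, and the JSON state file are hypothetical names used for illustration only.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_if_stale(name, inputs, output, action, state_file):
    """Run `action` only if an input changed since the last recorded run.

    Returns True if the action ran, False if the existing output was reused.
    """
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {str(p): file_digest(p) for p in inputs}
    if output.exists() and state.get(name) == current:
        return False  # inputs unchanged and output present: skip the work
    action(inputs, output)
    state[name] = current
    state_file.write_text(json.dumps(state, indent=2))
    return True


def count_notes(inputs, output):
    """Hypothetical stand-in for a feature extractor: counts input lines."""
    total = sum(len(p.read_text().splitlines()) for p in inputs)
    output.write_text(str(total))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    perf = root / "performance.txt"
    perf.write_text("60 0.00\n64 0.52\n67 1.01\n")
    out = root / "note_count.txt"
    state = root / "state.json"

    first = run_if_stale("note_count", [perf], out, count_notes, state)
    second = run_if_stale("note_count", [perf], out, count_notes, state)
    perf.write_text("60 0.00\n64 0.52\n67 1.01\n72 1.49\n")  # edit the input
    third = run_if_stale("note_count", [perf], out, count_notes, state)
    final_count = out.read_text()

print(first, second, third, final_count)  # True False True 4
```

Hashing file contents rather than comparing timestamps keeps the check stable when files are copied or checked out again without being modified, at the cost of reading every input on each run.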

Load-bearing premise

That existing algorithms implemented in multiple languages and data formats can be integrated into a single extensible pipeline without introducing significant compatibility issues, performance overhead, or loss of accuracy in feature extraction.

What would settle it

A controlled test on a fixed set of music performances where identical features are extracted once via Cosmodoit and once via separate manual scripts, then compared for exact match in outputs and reduction in total runtime.
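A minimal sketch of such a comparison harness follows. The two extraction routes are hypothetical stand-ins (`manual_route` and `pipeline_route` compute the same toy features from note onsets); in a real test they would call the separate manual scripts and the Cosmodoit pipeline respectively, neither of which is reproduced here.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def compare_features(reference, candidate):
    """Return the sorted keys whose values differ or exist on one side only."""
    keys = set(reference) | set(candidate)
    return sorted(k for k in keys if reference.get(k) != candidate.get(k))


def manual_route(onsets):
    """Hypothetical stand-in for features computed by separate scripts."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    return {"n_notes": len(onsets), "mean_ioi": sum(iois) / len(iois)}


def pipeline_route(onsets):
    """Hypothetical stand-in for the same features computed via a pipeline."""
    iois = [b - a for a, b in zip(onsets, onsets[1:])]
    return {"n_notes": len(onsets), "mean_ioi": sum(iois) / len(iois)}


onsets = [0.0, 0.5, 1.0, 1.75]
ref, t_manual = timed(manual_route, onsets)
cand, t_pipeline = timed(pipeline_route, onsets)
mismatches = compare_features(ref, cand)

print(mismatches)  # [] means every feature matched exactly
print(f"manual: {t_manual:.6f} s, pipeline: {t_pipeline:.6f} s")
```

Exact equality is the strictest criterion; if the two routes pass floating-point values through different languages or file formats, a small tolerance (for example via `math.isclose`) may be the more realistic check.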

Figures

Figures reproduced from arXiv: 2605.03541 by Corentin Guichaoua, Daniel Bedoya, Elaine Chew.

**Figure 1.** Cosmodoit system processes initiated by a single command-line call. Example based on Raoul Pugno’s performance

Original abstract

Computational analysis of performed music is a key component of music information research, as performance shapes much of the music we hear. Music performance analysis studies the acoustic variations introduced by performers and how these variations reflect musical interpretation and structure. Although many algorithms and tools exist for tasks such as performance-to-score alignment and symbolic or audio feature extraction, they are spread across different programming languages and data formats, making them difficult to combine efficiently. To address this problem, we present Cosmodoit, a novel Python package designed to streamline feature extraction from performed music. Cosmodoit integrates performance-to-score alignment with symbolic and audio feature extraction in a modular, flexible pipeline that supports selective processing, dependency-aware computation, and incremental updates. Its extensible design reduces duplicated work, minimizes errors, and enables efficient large-scale processing. By accommodating algorithms implemented in multiple languages and allowing parameter tuning for consistent feature extraction, Cosmodoit provides a versatile and practical tool for both research and development in music performance analysis.
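Selective, dependency-aware processing can be sketched as follows: given the features a user asks for, collect only the tasks those features depend on and run them in dependency order. The task names and the graph below are illustrative assumptions, not Cosmodoit's actual targets, and the standard-library `graphlib` module stands in for whatever scheduler the package uses.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the tasks it depends on.
DEPENDENCIES = {
    "alignment": {"score", "performance_midi"},
    "tempo": {"alignment"},
    "loudness": {"audio"},
    "tension": {"performance_midi"},
    "score": set(),
    "performance_midi": set(),
    "audio": set(),
}


def required_tasks(targets, dependencies):
    """Collect the targets and everything they depend on, transitively."""
    needed, stack = set(), list(targets)
    while stack:
        task = stack.pop()
        if task not in needed:
            needed.add(task)
            stack.extend(dependencies[task])
    return needed


def execution_order(targets, dependencies):
    """Return the needed tasks in an order that respects dependencies."""
    needed = required_tasks(targets, dependencies)
    subgraph = {task: dependencies[task] & needed for task in needed}
    return list(TopologicalSorter(subgraph).static_order())


order = execution_order(["tempo"], DEPENDENCIES)
print(order)  # alignment precedes tempo; loudness and tension are skipped
```

Requesting only `tempo` pulls in the score, the performance MIDI, and the alignment, while audio loading and loudness extraction are never scheduled.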

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Cosmodoit is a new Python package that unifies performance-to-score alignment and feature extraction with dependency tracking and incremental updates, but the write-up stays mostly at the design level.


Cosmodoit is a Python package that pulls performance-to-score alignment together with symbolic and audio feature extraction into one pipeline. It adds selective processing, dependency-aware computation, and incremental updates so users can avoid re-running everything when only part of the data changes. That setup directly tackles the scattered tools problem the abstract describes, where algorithms live in different languages and formats. The modular and extensible design is a clear practical step forward for people who run repeated analyses on performed music data. It also allows parameter tuning across components, which should help keep feature extraction consistent. Those are real engineering wins for the music information retrieval workflow.

The soft spot is the missing validation. The paper claims reduced duplicated work, fewer errors, and efficient large-scale processing, yet it gives no benchmarks, timing results, or accuracy checks against standalone tools. Without those numbers it is difficult to judge whether the integration actually delivers on speed or reliability, or whether multi-language bridging introduces overhead. The central design claim holds up on paper, but the evidence is thin.

This work is aimed at researchers who build custom pipelines for music performance analysis and want a single Python entry point. It is the sort of software contribution that can save time for others if the implementation is solid. I would send it to peer review rather than desk reject, mainly so referees can check the code, tests, and any performance data that may be in the full manuscript.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces Cosmodoit, a Python package for streamlining computational analysis of performed music. It integrates performance-to-score alignment with symbolic and audio feature extraction via a modular, flexible pipeline supporting selective processing, dependency-aware computation, and incremental updates. The design is claimed to accommodate multi-language algorithms, reduce duplicated work, minimize errors, and enable efficient large-scale processing while allowing parameter tuning.

Significance. If the described architecture is realized without significant compatibility or accuracy overhead, the package could offer a practical contribution to music information retrieval by unifying disparate tools and supporting reproducible, incremental workflows. The emphasis on modularity, dependency tracking, and extensibility is a clear strength for handling heterogeneous existing algorithms.

major comments (2)

[Abstract] The central claims that the extensible design 'reduces duplicated work, minimizes errors, and enables efficient large-scale processing' are presented without any supporting benchmarks, timing results, error analysis, usage examples, or case studies demonstrating these benefits over existing separate tools.
[Abstract] The manuscript provides no details on how multi-language components are integrated (e.g., via wrappers, APIs, or data format conversions) or on mechanisms for maintaining accuracy and avoiding performance overhead, which are load-bearing for the 'versatile and practical tool' claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive review and positive assessment of the package's potential contribution to music information retrieval. We address each major comment below and indicate the revisions planned for the manuscript.


Referee: [Abstract] The central claims that the extensible design 'reduces duplicated work, minimizes errors, and enables efficient large-scale processing' are presented without any supporting benchmarks, timing results, error analysis, usage examples, or case studies demonstrating these benefits over existing separate tools.

Authors: We agree that the abstract presents these benefits as design outcomes without quantitative or illustrative support in the current manuscript. The text focuses on architectural features such as selective processing, dependency-aware computation, and incremental updates, but does not demonstrate their impact through examples or measurements. In the revised version we will add a usage example section showing a concrete pipeline workflow on a small performed-music dataset, highlighting reduced manual steps and avoided duplication. We will also include basic timing comparisons for full versus selective/incremental runs. The abstract will be updated to reference these additions rather than stating the benefits unconditionally.

Revision: yes

Referee: [Abstract] The manuscript provides no details on how multi-language components are integrated (e.g., via wrappers, APIs, or data format conversions) or on mechanisms for maintaining accuracy and avoiding performance overhead, which are load-bearing for the 'versatile and practical tool' claim.

Authors: The referee is correct that the manuscript lacks concrete implementation details on multi-language support. While the abstract notes accommodation of algorithms in multiple languages, no description of wrappers, data exchange, accuracy safeguards, or overhead mitigation is provided. We will expand the manuscript with a new subsection describing the current integration strategy (Python subprocess calls and lightweight wrappers for external binaries, standardized interchange via JSON and MusicXML, and built-in checksum validation for feature consistency). We will also explain how the dependency graph and caching layer limit redundant cross-language calls, thereby addressing performance concerns. These additions will directly support the versatility claim.

Revision: yes
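The integration strategy named in this simulated response (subprocess calls, JSON interchange, checksum validation) can be illustrated with a short sketch. Nothing below is taken from Cosmodoit's code: the external tool is replaced by a child Python process so the example is self-contained, and `run_external`, `checksum`, and the payload fields are hypothetical names.

```python
import hashlib
import json
import subprocess
import sys

# Stand-in for an external tool: a child process that reads JSON on stdin
# and writes JSON on stdout. A real binary would replace this command.
CHILD_CODE = (
    "import json, sys; "
    "notes = json.load(sys.stdin)['onsets']; "
    "json.dump({'n_notes': len(notes), 'duration': notes[-1] - notes[0]}, "
    "sys.stdout)"
)
COMMAND = [sys.executable, "-c", CHILD_CODE]


def run_external(command, payload):
    """Send `payload` as JSON to an external command and parse its JSON reply."""
    completed = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)


def checksum(features):
    """Checksum of a feature dict that does not depend on key order."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


features = run_external(COMMAND, {"onsets": [0.0, 0.5, 1.0, 1.75]})
print(features)  # {'n_notes': 4, 'duration': 1.75}
print(checksum(features) == checksum({"duration": 1.75, "n_notes": 4}))  # True
```

Serializing with sorted keys before hashing means two runs that produce the same features in a different key order still yield the same checksum, which is one way a consistency check across tools could be made robust.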

Circularity Check

0 steps flagged

No significant circularity in software package description


The paper is a direct description of the Cosmodoit Python package architecture for integrating performance-to-score alignment with symbolic and audio feature extraction. It presents design choices such as modularity, selective processing, dependency-aware computation, and incremental updates as engineering features that reduce duplicated work. No derivations, equations, predictions, fitted parameters, or self-citation chains appear in the text. The central claims are about the practical benefits of the pipeline design and do not reduce to any inputs by construction. This matches the expected profile of a tool-contribution paper with no load-bearing mathematical steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a description of a software development project rather than a mathematical or empirical scientific result. There are no free parameters, mathematical axioms, or invented scientific entities; the central claim is the existence and design of the package itself.


