Cosmodoit: A Python Package for Adaptive, Efficient Pipelining of Feature Extraction from Performed Music
Pith reviewed 2026-05-07 12:48 UTC · model grok-4.3
The pith
Cosmodoit is a Python package that integrates performance-to-score alignment with symbolic and audio feature extraction in one modular pipeline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Cosmodoit integrates performance-to-score alignment with symbolic and audio feature extraction in a modular, flexible pipeline that supports selective processing, dependency-aware computation, and incremental updates. Its extensible design reduces duplicated work, minimizes errors, and enables efficient large-scale processing. By accommodating algorithms implemented in multiple languages and allowing parameter tuning for consistent feature extraction, Cosmodoit provides a versatile and practical tool for both research and development in music performance analysis.
What carries the argument
Cosmodoit, the Python package whose modular architecture integrates alignment and extraction tasks while managing dependencies and incremental recomputation.
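To make "dependency-aware computation" and "incremental updates" concrete, here is a minimal, self-contained sketch of the general technique. It is not Cosmodoit's code or API: the `Pipeline` class, the task names, and the toy feature functions are all hypothetical, and the sketch assumes task inputs are JSON-serializable so they can be fingerprinted.

```python
import hashlib
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Pipeline:
    """Toy pipeline (hypothetical, not Cosmodoit's API)."""

    def __init__(self):
        self.tasks = {}    # task name -> (function, dependency names)
        self.cache = {}    # task name -> (input fingerprint, result)
        self.run_log = []  # names of tasks that were actually executed

    def add(self, name, func, deps=()):
        self.tasks[name] = (func, list(deps))

    def run(self, targets, sources):
        # Selective processing: collect only the tasks the targets need.
        needed, stack = {}, list(targets)
        while stack:
            name = stack.pop()
            if name in needed or name in sources:
                continue
            needed[name] = self.tasks[name][1]
            stack.extend(needed[name])
        # Dependency-aware ordering: every task runs after its prerequisites.
        graph = {n: [d for d in deps if d in needed] for n, deps in needed.items()}
        results = dict(sources)
        for name in TopologicalSorter(graph).static_order():
            func, deps = self.tasks[name]
            inputs = [results[d] for d in deps]
            key = hashlib.sha256(
                json.dumps(inputs, sort_keys=True).encode("utf-8")
            ).hexdigest()
            # Incremental update: reuse the result if the inputs are unchanged.
            if name in self.cache and self.cache[name][0] == key:
                results[name] = self.cache[name][1]
            else:
                results[name] = func(*inputs)
                self.cache[name] = (key, results[name])
                self.run_log.append(name)
        return {t: results[t] for t in targets}


# Toy stand-ins for alignment and feature extraction (illustrative only).
def align(performance, score):
    return [[onset, beat] for onset, beat in zip(performance, score)]


def tempo(alignment):
    pairs = zip(alignment, alignment[1:])
    return [60.0 * (b2 - b1) / (t2 - t1) for (t1, b1), (t2, b2) in pairs]


def loudness(audio):
    return max(audio)


pipe = Pipeline()
pipe.add("alignment", align, deps=["performance", "score"])
pipe.add("tempo", tempo, deps=["alignment"])
pipe.add("loudness", loudness, deps=["audio"])

sources = {"performance": [0.0, 0.5, 1.5], "score": [0, 1, 2], "audio": [0.1, 0.4, 0.2]}
first = pipe.run(["tempo"], sources)        # runs alignment, then tempo
log_after_first = list(pipe.run_log)
pipe.run(["tempo"], sources)                # nothing changed: all cached
log_after_second = list(pipe.run_log)
sources["audio"] = [0.1, 0.9, 0.2]
second = pipe.run(["tempo", "loudness"], sources)  # only loudness runs
log_after_third = list(pipe.run_log)
```

In this sketch a task is rerun only when the values it depends on change. A change to the task's own code would go undetected, which a real tool would also need to handle.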
Load-bearing premise
That existing algorithms implemented in multiple languages and data formats can be integrated into a single extensible pipeline without introducing significant compatibility issues, performance overhead, or loss of accuracy in feature extraction.
What would settle it
A controlled test on a fixed set of music performances where identical features are extracted once via Cosmodoit and once via separate manual scripts, then compared for exact match in outputs and reduction in total runtime.
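The shape of such a test can be sketched in a few lines. This is an assumption-laden illustration, not a benchmark of Cosmodoit: `compare_routes`, `pipeline_route`, and `manual_route` are hypothetical names, and the two routes are toy stand-ins that compute velocity statistics from lists of MIDI velocities.

```python
import time


def compare_routes(route_a, route_b, performances):
    """Run two extraction routes on the same inputs; report mismatches and runtimes."""

    def timed(route):
        start = time.perf_counter()
        outputs = [route(p) for p in performances]
        return outputs, time.perf_counter() - start

    outputs_a, seconds_a = timed(route_a)
    outputs_b, seconds_b = timed(route_b)
    # Exact-match comparison: record the index of every differing output.
    mismatches = [i for i, (a, b) in enumerate(zip(outputs_a, outputs_b)) if a != b]
    return {"mismatches": mismatches, "seconds_a": seconds_a, "seconds_b": seconds_b}


# Hypothetical stand-in for extraction through the pipeline.
def pipeline_route(velocities):
    return {
        "mean_velocity": sum(velocities) / len(velocities),
        "max_velocity": max(velocities),
    }


# Hypothetical stand-in for a separately written manual script.
def manual_route(velocities):
    total = 0
    for v in velocities:
        total += v
    return {"mean_velocity": total / len(velocities), "max_velocity": max(velocities)}


performances = [[60, 72, 84], [40, 45, 50, 55], [100]]
report = compare_routes(pipeline_route, manual_route, performances)
```

An empty `report["mismatches"]` list means the two routes agree exactly on every performance. For real floating-point audio features, a tolerance-based comparison may be more appropriate than exact equality, and runtimes should be averaged over repeated runs.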
Original abstract
Computational analysis of performed music is a key component of music information research, as performance shapes much of the music we hear. Music performance analysis studies the acoustic variations introduced by performers and how these variations reflect musical interpretation and structure. Although many algorithms and tools exist for tasks such as performance-to-score alignment and symbolic or audio feature extraction, they are spread across different programming languages and data formats, making them difficult to combine efficiently. To address this problem, we present Cosmodoit, a novel Python package designed to streamline feature extraction from performed music. Cosmodoit integrates performance-to-score alignment with symbolic and audio feature extraction in a modular, flexible pipeline that supports selective processing, dependency-aware computation, and incremental updates. Its extensible design reduces duplicated work, minimizes errors, and enables efficient large-scale processing. By accommodating algorithms implemented in multiple languages and allowing parameter tuning for consistent feature extraction, Cosmodoit provides a versatile and practical tool for both research and development in music performance analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Cosmodoit, a Python package for streamlining computational analysis of performed music. It integrates performance-to-score alignment with symbolic and audio feature extraction via a modular, flexible pipeline supporting selective processing, dependency-aware computation, and incremental updates. The design is claimed to accommodate multi-language algorithms, reduce duplicated work, minimize errors, and enable efficient large-scale processing while allowing parameter tuning.
Significance. If the described architecture is realized without significant compatibility or accuracy overhead, the package could offer a practical contribution to music information retrieval by unifying disparate tools and supporting reproducible, incremental workflows. The emphasis on modularity, dependency tracking, and extensibility is a clear strength for handling heterogeneous existing algorithms.
major comments (2)
- [Abstract] The central claims that the extensible design 'reduces duplicated work, minimizes errors, and enables efficient large-scale processing' are presented without any supporting benchmarks, timing results, error analysis, usage examples, or case studies demonstrating these benefits over existing separate tools.
- [Abstract] The manuscript provides no details on how multi-language components are integrated (e.g., via wrappers, APIs, or data format conversions) or on mechanisms for maintaining accuracy and avoiding performance overhead, which are load-bearing for the 'versatile and practical tool' claim.
Simulated Author's Rebuttal
We thank the referee for their constructive review and positive assessment of the package's potential contribution to music information retrieval. We address each major comment below and indicate the revisions planned for the manuscript.
- Referee: [Abstract] The central claims that the extensible design 'reduces duplicated work, minimizes errors, and enables efficient large-scale processing' are presented without any supporting benchmarks, timing results, error analysis, usage examples, or case studies demonstrating these benefits over existing separate tools.
  Authors: We agree that the abstract presents these benefits as design outcomes without quantitative or illustrative support in the current manuscript. The text focuses on architectural features such as selective processing, dependency-aware computation, and incremental updates, but does not demonstrate their impact through examples or measurements. In the revised version we will add a usage example section showing a concrete pipeline workflow on a small performed-music dataset, highlighting reduced manual steps and avoided duplication. We will also include basic timing comparisons for full versus selective/incremental runs. The abstract will be updated to reference these additions rather than stating the benefits unconditionally.
  Revision: yes
- Referee: [Abstract] The manuscript provides no details on how multi-language components are integrated (e.g., via wrappers, APIs, or data format conversions) or on mechanisms for maintaining accuracy and avoiding performance overhead, which are load-bearing for the 'versatile and practical tool' claim.
  Authors: The referee is correct that the manuscript lacks concrete implementation details on multi-language support. While the abstract notes accommodation of algorithms in multiple languages, no description of wrappers, data exchange, accuracy safeguards, or overhead mitigation is provided. We will expand the manuscript with a new subsection describing the current integration strategy (Python subprocess calls and lightweight wrappers for external binaries, standardized interchange via JSON and MusicXML, and built-in checksum validation for feature consistency). We will also explain how the dependency graph and caching layer limit redundant cross-language calls, thereby addressing performance concerns. These additions will directly support the versatility claim.
  Revision: yes
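The integration strategy named in this simulated response (subprocess calls, JSON interchange, checksum validation) can be illustrated with a short sketch. It shows the general pattern only and should not be read as Cosmodoit's implementation: `run_external`, `checksum`, and `STAND_IN_TOOL` are hypothetical names, and a one-line Python script run in a child process stands in for an external binary written in another language.

```python
import hashlib
import json
import subprocess
import sys


def run_external(command, payload):
    """Send a JSON payload to an external tool on stdin and parse JSON from stdout."""
    completed = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits with an error
    )
    return json.loads(completed.stdout)


def checksum(features):
    """SHA-256 over a canonical JSON encoding, so equal features hash equally."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Stand-in for an external binary (assumption): a child Python process that
# reads note velocities as JSON and writes their mean as JSON.
STAND_IN_TOOL = [
    sys.executable,
    "-c",
    "import json, sys; v = json.load(sys.stdin)['velocities']; "
    "json.dump({'mean_velocity': sum(v) / len(v)}, sys.stdout)",
]

features = run_external(STAND_IN_TOOL, {"velocities": [60, 72, 84]})
digest = checksum(features)
```

Comparing `digest` across runs, or against a stored value, is one simple way to detect when a wrapped tool starts producing different features for the same input.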
Circularity Check
No significant circularity in software package description
The paper is a direct description of the Cosmodoit Python package architecture for integrating performance-to-score alignment with symbolic and audio feature extraction. It presents design choices such as modularity, selective processing, dependency-aware computation, and incremental updates as engineering features that reduce duplicated work. No derivations, equations, predictions, fitted parameters, or self-citation chains appear in the text. The central claims are about the practical benefits of the pipeline design and do not reduce to any inputs by construction. This matches the expected profile of a tool-contribution paper with no load-bearing mathematical steps.