pith. machine review for the scientific record.

arxiv: 2604.11560 · v1 · submitted 2026-04-13 · 💻 cs.LG · cs.AI

Recognition: unknown

bacpipe: a Python package to make bioacoustic deep learning models accessible

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 15:08 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords bioacoustics · deep learning · Python package · passive acoustic monitoring · acoustic embeddings · model accessibility · ecology software · evaluation pipelines

The pith

Bacpipe is a Python package that lets ecologists and computer scientists apply state-of-the-art bioacoustic deep learning models to their own audio datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents bacpipe as a modular software package intended to make advanced deep learning tools for analyzing natural sounds available to a broad audience. It streamlines the process of running models on custom audio data to produce feature vectors and predictions while adding built-in options for evaluation. The authors argue that easier access will let researchers pursue new ecological and evolutionary questions using the millions of hours of passive acoustic recordings already collected.

Core claim

Bacpipe is a collection of bioacoustic deep learning models and evaluation pipelines accessible through a graphical and programming interface. It streamlines the usage of state-of-the-art models on custom audio datasets, generating acoustic feature vectors and classifier predictions. A modular design allows evaluation and benchmarking of models through interactive visualizations, clustering and probing.
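Concretely, "generating acoustic feature vectors" means mapping fixed-length audio windows through an encoder. The sketch below is not bacpipe's API (which this review does not reproduce); the windowing step is standard, and the random-projection "encoder" is a stand-in for a trained bioacoustic model:

```python
import numpy as np

def frame_audio(wave, sr, win_s=3.0):
    """Split a 1-D waveform into fixed-length, non-overlapping windows."""
    win = int(sr * win_s)
    n = len(wave) // win
    return wave[: n * win].reshape(n, win)

def embed(frames, dim=128, seed=0):
    """Stand-in encoder: a fixed random projection instead of a trained model."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((frames.shape[1], dim)) / np.sqrt(frames.shape[1])
    return frames @ W

sr = 16_000
wave = np.sin(2 * np.pi * 440 * np.arange(10 * sr) / sr)  # 10 s test tone
frames = frame_audio(wave, sr)   # 3 windows of 48000 samples each
emb = embed(frames)              # one 128-dim feature vector per window
print(emb.shape)
```

A real pipeline swaps the projection for a pretrained network and handles each model's own sample rate and window length, which is exactly the bookkeeping a wrapper package spares the user.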

What carries the argument

The bacpipe package, which functions as a modular convergence point integrating bioacoustic models with user-friendly interfaces for data processing, embedding generation, prediction, and evaluation pipelines.

If this is right

  • Users can generate acoustic feature vectors and classifier predictions from their own custom audio datasets using current state-of-the-art models.
  • Models can be evaluated and compared through interactive visualizations, clustering, and probing without separate coding.
  • Both ecologists and computer scientists gain direct access to deep-learning advances in bioacoustics.
  • Researchers can use the package to explore new ecological and evolutionary questions based on large acoustic monitoring collections.
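The clustering-based comparison these bullets describe can be illustrated outside bacpipe with scikit-learn; the toy embeddings and two-"species" setup are invented for the example:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
# Toy embeddings: two well-separated clusters standing in for two species
emb = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(3, 0.1, (20, 8))])
labels = np.array([0] * 20 + [1] * 20)

# Cluster the embeddings, then score agreement with the true labels.
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
ari = adjusted_rand_score(labels, pred)  # 1.0 = perfect match up to relabeling
print(round(ari, 2))
```

An embedding model whose vectors cluster cleanly by species scores near 1.0; a weak one drifts toward 0, which is the kind of side-by-side number a benchmarking pipeline can report per model.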

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Widespread use could speed up analysis of the large existing archives of passive acoustic recordings.
  • The package could serve as a shared platform that lowers the barrier for interdisciplinary teams to test new models quickly.
  • Future updates might add automated checks for how well models generalize to different recording environments.

Load-bearing premise

That a modular interface and ready-made pipelines will be enough for ecologists and computer scientists to use the models effectively, without hitting new technical barriers or needing to separately validate model performance on their own data.
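The validation caveat in this premise is easy to demonstrate: a classifier trained on embeddings from one recording domain can fail silently on a shifted one. A toy sketch (scikit-learn, synthetic data; the shift stands in for, say, a new habitat's noise floor):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def domain(shift):
    """Two classes of embeddings; `shift` moves the whole recording domain."""
    a = rng.normal(0 + shift, 1, (50, 8))
    b = rng.normal(2 + shift, 1, (50, 8))
    return np.vstack([a, b]), np.array([0] * 50 + [1] * 50)

X_src, y_src = domain(shift=0.0)  # conditions the model was validated on
X_tgt, y_tgt = domain(shift=4.0)  # new deployment conditions

clf = LogisticRegression(max_iter=1000).fit(X_src, y_src)
src_acc = clf.score(X_src, y_src)  # near-perfect in-domain
tgt_acc = clf.score(X_tgt, y_tgt)  # collapses toward chance out of domain
print(src_acc, tgt_acc)
```

Nothing in the pipeline itself warns the user of the drop; only a held-out check on the new data reveals it, which is why the premise carries real weight.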

What would settle it

A test in which ecologists with no deep-learning background load their own audio files into bacpipe, generate embeddings and predictions, run the clustering and visualization tools, and report whether the process succeeds without extra coding or debugging.

read the original abstract

  1. Natural sounds have been recorded for millions of hours over the previous decades using passive acoustic monitoring. Improvements in deep learning models have vastly accelerated the analysis of large portions of this data. While new models advance the state-of-the-art, accessing them using tools to harness their full potential is not always straightforward. Here we present bacpipe, a collection of bioacoustic deep learning models and evaluation pipelines accessible through a graphical and programming interface, designed for both ecologists and computer scientists. Bacpipe is a modular software package intended as a point of convergence for bioacoustic models.
  2. Bacpipe streamlines the usage of state-of-the-art models on custom audio datasets, generating acoustic feature vectors (embeddings) and classifier predictions. A modular design allows evaluation and benchmarking of models through interactive visualizations, clustering and probing.
  3. We believe that access to new deep learning models is important. By designing bacpipe to target a wide audience, researchers will be enabled to answer new ecological and evolutionary questions in bioacoustics.
  4. In conclusion, we believe accessibility to developments in deep learning to a wider audience benefits the ecological questions we are trying to answer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces bacpipe, a Python package providing a modular collection of bioacoustic deep learning models accessible via both graphical and programming interfaces. It claims to streamline application of state-of-the-art models to custom audio datasets for generating embeddings and classifier predictions, while enabling evaluation, benchmarking, interactive visualizations, clustering, and probing for ecologists and computer scientists.

Significance. If the package delivers on its accessibility promises without introducing new technical barriers, it could meaningfully accelerate bioacoustics research by allowing wider use of advanced models on passive acoustic monitoring data, supporting new ecological and evolutionary questions. The modular design is a positive feature for extensibility across user groups.

major comments (2)
  1. [Abstract] Abstract paragraph 2: the central claim that bacpipe 'streamlines the usage of state-of-the-art models on custom audio datasets' and enables researchers to 'harness their full potential' rests on an untested assumption that the modular GUI/programming interface plus pipelines will suffice; the manuscript provides no user studies, error-rate measurements on out-of-domain audio, or comparisons against existing tools (e.g., BirdNET wrappers) to demonstrate reduced barriers for non-experts.
  2. [Abstract] Abstract: no benchmarks, validation results, or quantification of how often users still encounter model-specific preprocessing or embedding issues are reported, leaving the accessibility and benchmarking claims unsubstantiated despite their load-bearing role for the paper's contribution.
minor comments (1)
  1. [Abstract] Paragraphs 3 and 4 repeat similar statements about the importance of accessibility without adding distinct content or specific examples of enabled research questions.
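"Probing", as the core claim uses it, conventionally means fitting a shallow linear classifier on frozen embeddings and reading its held-out accuracy as a quality score for the encoder. A minimal version of such a benchmark (scikit-learn, synthetic embeddings; not bacpipe code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Synthetic frozen embeddings for two sound classes
emb = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(2, 1, (50, 16))])
labels = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(
    emb, labels, test_size=0.3, random_state=0, stratify=labels)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = probe.score(X_te, y_te)  # higher = classes are linearly separable in the embedding
print(acc)
```

Reported over several models and datasets, this single number is what would turn the manuscript's benchmarking claims into evidence; the referee's point is that no such table appears.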

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on the manuscript. We agree that the abstract's claims regarding streamlined usage and accessibility would benefit from more cautious phrasing, as the current version does not include empirical validation such as user studies or benchmarks. We have revised the abstract accordingly to better align the language with the manuscript's scope as a software description paper. Below we address each major comment in turn.

read point-by-point responses
  1. Referee: [Abstract] Abstract paragraph 2: the central claim that bacpipe 'streamlines the usage of state-of-the-art models on custom audio datasets' and enables researchers to 'harness their full potential' rests on an untested assumption that the modular GUI/programming interface plus pipelines will suffice; the manuscript provides no user studies, error-rate measurements on out-of-domain audio, or comparisons against existing tools (e.g., BirdNET wrappers) to demonstrate reduced barriers for non-experts.

    Authors: We acknowledge that the manuscript does not contain user studies, error-rate measurements on out-of-domain data, or head-to-head comparisons with tools such as BirdNET wrappers. The paper's contribution is the design and implementation of a modular package rather than an empirical evaluation of its impact on user barriers. We have revised the abstract to replace stronger claims of 'streamlining' and 'harnessing full potential' with statements that describe the provided interfaces and pipelines as designed to facilitate access and use for both ecologists and computer scientists, without asserting proven reductions in barriers. revision: yes

  2. Referee: [Abstract] Abstract: no benchmarks, validation results, or quantification of how often users still encounter model-specific preprocessing or embedding issues are reported, leaving the accessibility and benchmarking claims unsubstantiated despite their load-bearing role for the paper's contribution.

    Authors: The manuscript introduces evaluation and benchmarking pipelines as part of bacpipe's functionality but does not itself report specific benchmarks, validation results, or statistics on residual preprocessing issues. This is consistent with the paper being a software tool description rather than a benchmarking study. We have revised the abstract to clarify that the package supplies modular components enabling users to perform such evaluations and address model-specific issues, while removing language that could be read as claiming the package has already been shown to eliminate those issues. revision: yes

Circularity Check

0 steps flagged

No circularity: software package description with no derivations or predictions

full rationale

This manuscript is a software description paper presenting the bacpipe Python package for bioacoustic deep learning models. It contains no equations, no fitted parameters, no predictions, and no derivation chain of any kind. The central claims concern the modular design, interfaces, and intended accessibility benefits, which are presented as design choices rather than results derived from prior steps within the paper. No self-citation load-bearing arguments, ansatz smuggling, or renaming of known results occur. The paper is self-contained as a tool announcement and does not reduce any claim to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that a modular software wrapper will meaningfully converge and simplify access to existing models without introducing new technical requirements or validation needs.

pith-pipeline@v0.9.0 · 5518 in / 1089 out tokens · 38882 ms · 2026-05-10T15:08:46.707875+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

56 extracted references · 25 canonical work pages
