pith. machine review for the scientific record.

arxiv: 2604.11560 · v1 · submitted 2026-04-13 · 💻 cs.LG · cs.AI

Recognition: unknown

bacpipe: a Python package to make bioacoustic deep learning models accessible

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 15:08 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords bioacoustics · deep learning · Python package · passive acoustic monitoring · acoustic embeddings · model accessibility · ecology software · evaluation pipelines

The pith

Bacpipe is a Python package that lets ecologists and computer scientists apply state-of-the-art bioacoustic deep learning models to their own audio datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents bacpipe as a modular software package intended to make advanced deep learning tools for analyzing natural sounds available to a broad audience. It streamlines the process of running models on custom audio data to produce feature vectors and predictions while adding built-in options for evaluation. The authors argue that easier access will let researchers pursue new ecological and evolutionary questions using the millions of hours of passive acoustic recordings already collected.

Core claim

Bacpipe is a collection of bioacoustic deep learning models and evaluation pipelines accessible through a graphical and programming interface. It streamlines the usage of state-of-the-art models on custom audio datasets, generating acoustic feature vectors and classifier predictions. A modular design allows evaluation and benchmarking of models through interactive visualizations, clustering and probing.
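Concretely, "generating acoustic feature vectors" means mapping fixed-length audio windows through an encoder. The sketch below is not bacpipe's API (which this review does not reproduce); the windowing step is standard, and the random-projection "encoder" is a stand-in for a trained bioacoustic model:

```python
import numpy as np

def frame_audio(wave, sr, win_s=3.0):
    """Split a 1-D waveform into fixed-length, non-overlapping windows."""
    win = int(sr * win_s)
    n = len(wave) // win
    return wave[: n * win].reshape(n, win)

def embed(frames, dim=128, seed=0):
    """Stand-in encoder: a fixed random projection instead of a trained model."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((frames.shape[1], dim)) / np.sqrt(frames.shape[1])
    return frames @ W

sr = 16_000
wave = np.sin(2 * np.pi * 440 * np.arange(10 * sr) / sr)  # 10 s test tone
frames = frame_audio(wave, sr)   # 3 windows of 48000 samples each
emb = embed(frames)              # one 128-dim feature vector per window
print(emb.shape)
```

A real pipeline swaps the projection for a pretrained network and handles each model's own sample rate and window length, which is exactly the bookkeeping a wrapper package spares the user.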

What carries the argument

The bacpipe package, which functions as a modular convergence point integrating bioacoustic models with user-friendly interfaces for data processing, embedding generation, prediction, and evaluation pipelines.

If this is right

  • Users can generate acoustic feature vectors and classifier predictions from their own custom audio datasets using current state-of-the-art models.
  • Models can be evaluated and compared through interactive visualizations, clustering, and probing without separate coding.
  • Both ecologists and computer scientists gain direct access to deep-learning advances in bioacoustics.
  • Researchers can use the package to explore new ecological and evolutionary questions based on large acoustic monitoring collections.
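The clustering-based comparison these bullets describe can be illustrated outside bacpipe with scikit-learn; the toy embeddings and two-"species" setup are invented for the example:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
# Toy embeddings: two well-separated clusters standing in for two species
emb = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(3, 0.1, (20, 8))])
labels = np.array([0] * 20 + [1] * 20)

# Cluster the embeddings, then score agreement with the true labels.
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
ari = adjusted_rand_score(labels, pred)  # 1.0 = perfect match up to relabeling
print(round(ari, 2))
```

An embedding model whose vectors cluster cleanly by species scores near 1.0; a weak one drifts toward 0, which is the kind of side-by-side number a benchmarking pipeline can report per model.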

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Widespread use could speed up analysis of the large existing archives of passive acoustic recordings.
  • The package could serve as a shared platform that lowers the barrier for interdisciplinary teams to test new models quickly.
  • Future updates might add automated checks for how well models generalize to different recording environments.

Load-bearing premise

That a modular interface and ready-made pipelines will be enough for ecologists and computer scientists to use the models effectively, without hitting new technical barriers or needing to separately validate model performance on their own data.
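The validation caveat in this premise is easy to demonstrate: a classifier trained on embeddings from one recording domain can fail silently on a shifted one. A toy sketch (scikit-learn, synthetic data; the shift stands in for, say, a new habitat's noise floor):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def domain(shift):
    """Two classes of embeddings; `shift` moves the whole recording domain."""
    a = rng.normal(0 + shift, 1, (50, 8))
    b = rng.normal(2 + shift, 1, (50, 8))
    return np.vstack([a, b]), np.array([0] * 50 + [1] * 50)

X_src, y_src = domain(shift=0.0)  # conditions the model was validated on
X_tgt, y_tgt = domain(shift=4.0)  # new deployment conditions

clf = LogisticRegression(max_iter=1000).fit(X_src, y_src)
src_acc = clf.score(X_src, y_src)  # near-perfect in-domain
tgt_acc = clf.score(X_tgt, y_tgt)  # collapses toward chance out of domain
print(src_acc, tgt_acc)
```

Nothing in the pipeline itself warns the user of the drop; only a held-out check on the new data reveals it, which is why the premise carries real weight.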

What would settle it

A test in which ecologists with no deep-learning background load their own audio files into bacpipe, generate embeddings and predictions, run the clustering and visualization tools, and report whether the process succeeds without extra coding or debugging.

read the original abstract

  1. Natural sounds have been recorded for millions of hours over the previous decades using passive acoustic monitoring. Improvements in deep learning models have vastly accelerated the analysis of large portions of this data. While new models advance the state-of-the-art, accessing them using tools to harness their full potential is not always straightforward. Here we present bacpipe, a collection of bioacoustic deep learning models and evaluation pipelines accessible through a graphical and programming interface, designed for both ecologists and computer scientists. Bacpipe is a modular software package intended as a point of convergence for bioacoustic models.
  2. Bacpipe streamlines the usage of state-of-the-art models on custom audio datasets, generating acoustic feature vectors (embeddings) and classifier predictions. A modular design allows evaluation and benchmarking of models through interactive visualizations, clustering and probing.
  3. We believe that access to new deep learning models is important. By designing bacpipe to target a wide audience, researchers will be enabled to answer new ecological and evolutionary questions in bioacoustics.
  4. In conclusion, we believe accessibility to developments in deep learning to a wider audience benefits the ecological questions we are trying to answer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces bacpipe, a Python package providing a modular collection of bioacoustic deep learning models accessible via both graphical and programming interfaces. It claims to streamline application of state-of-the-art models to custom audio datasets for generating embeddings and classifier predictions, while enabling evaluation, benchmarking, interactive visualizations, clustering, and probing for ecologists and computer scientists.

Significance. If the package delivers on its accessibility promises without introducing new technical barriers, it could meaningfully accelerate bioacoustics research by allowing wider use of advanced models on passive acoustic monitoring data, supporting new ecological and evolutionary questions. The modular design is a positive feature for extensibility across user groups.

major comments (2)
  1. [Abstract] Abstract paragraph 2: the central claim that bacpipe 'streamlines the usage of state-of-the-art models on custom audio datasets' and enables researchers to 'harness their full potential' rests on an untested assumption that the modular GUI/programming interface plus pipelines will suffice; the manuscript provides no user studies, error-rate measurements on out-of-domain audio, or comparisons against existing tools (e.g., BirdNET wrappers) to demonstrate reduced barriers for non-experts.
  2. [Abstract] Abstract: no benchmarks, validation results, or quantification of how often users still encounter model-specific preprocessing or embedding issues are reported, leaving the accessibility and benchmarking claims unsubstantiated despite their load-bearing role for the paper's contribution.
minor comments (1)
  1. [Abstract] Paragraphs 3 and 4 repeat similar statements about the importance of accessibility without adding distinct content or specific examples of enabled research questions.
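"Probing", as the core claim uses it, conventionally means fitting a shallow linear classifier on frozen embeddings and reading its held-out accuracy as a quality score for the encoder. A minimal version of such a benchmark (scikit-learn, synthetic embeddings; not bacpipe code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Synthetic frozen embeddings for two sound classes
emb = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(2, 1, (50, 16))])
labels = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(
    emb, labels, test_size=0.3, random_state=0, stratify=labels)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = probe.score(X_te, y_te)  # higher = classes are linearly separable in the embedding
print(acc)
```

Reported over several models and datasets, this single number is what would turn the manuscript's benchmarking claims into evidence; the referee's point is that no such table appears.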

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on the manuscript. We agree that the abstract's claims regarding streamlined usage and accessibility would benefit from more cautious phrasing, as the current version does not include empirical validation such as user studies or benchmarks. We have revised the abstract accordingly to better align the language with the manuscript's scope as a software description paper. Below we address each major comment in turn.

read point-by-point responses
  1. Referee: [Abstract] Abstract paragraph 2: the central claim that bacpipe 'streamlines the usage of state-of-the-art models on custom audio datasets' and enables researchers to 'harness their full potential' rests on an untested assumption that the modular GUI/programming interface plus pipelines will suffice; the manuscript provides no user studies, error-rate measurements on out-of-domain audio, or comparisons against existing tools (e.g., BirdNET wrappers) to demonstrate reduced barriers for non-experts.

    Authors: We acknowledge that the manuscript does not contain user studies, error-rate measurements on out-of-domain data, or head-to-head comparisons with tools such as BirdNET wrappers. The paper's contribution is the design and implementation of a modular package rather than an empirical evaluation of its impact on user barriers. We have revised the abstract to replace stronger claims of 'streamlining' and 'harnessing full potential' with statements that describe the provided interfaces and pipelines as designed to facilitate access and use for both ecologists and computer scientists, without asserting proven reductions in barriers. revision: yes

  2. Referee: [Abstract] Abstract: no benchmarks, validation results, or quantification of how often users still encounter model-specific preprocessing or embedding issues are reported, leaving the accessibility and benchmarking claims unsubstantiated despite their load-bearing role for the paper's contribution.

    Authors: The manuscript introduces evaluation and benchmarking pipelines as part of bacpipe's functionality but does not itself report specific benchmarks, validation results, or statistics on residual preprocessing issues. This is consistent with the paper being a software tool description rather than a benchmarking study. We have revised the abstract to clarify that the package supplies modular components enabling users to perform such evaluations and address model-specific issues, while removing language that could be read as claiming the package has already been shown to eliminate those issues. revision: yes

Circularity Check

0 steps flagged

No circularity: software package description with no derivations or predictions

full rationale

This manuscript is a software description paper presenting the bacpipe Python package for bioacoustic deep learning models. It contains no equations, no fitted parameters, no predictions, and no derivation chain of any kind. The central claims concern the modular design, interfaces, and intended accessibility benefits, which are presented as design choices rather than results derived from prior steps within the paper. No self-citation load-bearing arguments, ansatz smuggling, or renaming of known results occur. The paper is self-contained as a tool announcement and does not reduce any claim to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that a modular software wrapper will meaningfully converge and simplify access to existing models without introducing new technical requirements or validation needs.

pith-pipeline@v0.9.0 · 5518 in / 1089 out tokens · 38882 ms · 2026-05-10T15:08:46.707875+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

56 extracted references · 25 canonical work pages
