SEABAD: A Tropical Bird Activity Detection Dataset for Passive Acoustic Monitoring

Mohd Yamani Idna Idris; Muhammad Mun'im Ahmad Zabidi; Norisma Idris

arXiv: 2605.20853 · v1 · pith:2H5YBSVP · submitted 2026-05-20 · cs.SD · eess.AS

Pith reviewed 2026-05-21 02:11 UTC · model grok-4.3

keywords: bird audio detection · passive acoustic monitoring · tropical soundscapes · audio dataset · Southeast Asia · machine learning · biodiversity monitoring · edge deployment

The pith

A new dataset of 50,000 balanced tropical audio clips supports accurate detection of bird vocalizations in dense Southeast Asian soundscapes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SEABAD to address the shortage of training data for bird audio detection systems suited to tropical rather than temperate environments. Tropical soundscapes contain more overlapping sounds and species, which makes standard detectors less effective and increases the volume of useless recordings that must be stored or transmitted. The authors apply a dual-branch curation process to create an evenly split collection of three-second clips, standardize the audio format, and verify label quality through a manual audit before reporting strong baseline classification results. If the dataset holds up under realistic conditions, it would let monitoring projects discard irrelevant audio earlier and run on lower-power hardware for extended field use.

Core claim

We introduce SEABAD, a dataset of 50,000 curated three-second clips from Southeast Asian soundscapes, evenly balanced between bird-present and bird-absent samples spanning 1,677 bird species and standardized to 16 kHz mono audio. A dual-branch curation pipeline applies a six-stage positive-label workflow to Xeno-Canto recordings alongside source-specific negative-label extractions from environmental datasets, reducing class imbalance. A manual audit of 1,000 clips and baseline experiments with MobileNetV3-Small reaching 99.57 percent accuracy and 0.9985 AUC confirm the dataset supports reliable tropical bird activity detection.
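As a rough illustration of the clip format described above, the sketch below downmixes multi-channel audio to mono and cuts it into non-overlapping three-second clips at 16 kHz (48,000 samples each). It assumes the audio has already been resampled to 16 kHz. The function names and the choice to drop a trailing partial clip are illustrative assumptions, not the authors' released pipeline.

```python
SAMPLE_RATE = 16_000                       # Hz, the sample rate stated for SEABAD
CLIP_SECONDS = 3                           # clip length stated for SEABAD
CLIP_SAMPLES = SAMPLE_RATE * CLIP_SECONDS  # 48,000 samples per clip


def to_mono(frames):
    """Average the channels of each frame (a frame is a tuple of channel samples)."""
    return [sum(frame) / len(frame) for frame in frames]


def split_into_clips(samples, clip_samples=CLIP_SAMPLES):
    """Cut a mono signal into non-overlapping clips, dropping any short remainder.

    Dropping the remainder is an assumption made for this sketch.
    """
    n_clips = len(samples) // clip_samples
    return [samples[i * clip_samples:(i + 1) * clip_samples] for i in range(n_clips)]


# Hypothetical input: 7 seconds of stereo audio already at 16 kHz.
stereo = [(0.5, -0.5)] * (7 * SAMPLE_RATE)
clips = split_into_clips(to_mono(stereo))
print(len(clips), len(clips[0]))
```

With seven seconds of input, two full clips fit and the final second is discarded; a real pipeline might instead pad or overlap windows.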

What carries the argument

The dual-branch curation pipeline that combines a six-stage positive-label workflow on bird recordings with source-specific negative extractions from environmental datasets to generate balanced positive and negative samples.

Load-bearing premise

The six-stage positive-label workflow and source-specific negative extractions produce reliable labels for tropical soundscapes that generalize from the audited subset to the full 50,000-clip collection.

What would settle it

If models trained on SEABAD showed substantially lower accuracy on an independent collection of Southeast Asian field recordings labeled by the same workflow, that would indicate that the labels or the soundscape coverage do not generalize.
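A check of this kind comes down to scoring a trained detector on held-out field clips and comparing accuracy and AUC against the in-dataset figures. The sketch below shows one minimal way to compute both metrics from labels and detector scores. The toy labels, the scores, and the 0.5 decision threshold are hypothetical and are not taken from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of clips whose thresholded score matches the 0/1 label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Pairwise comparison is O(P * N), which is
    fine for a sketch but slow for very large evaluation sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative clip")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out clips: 1 = bird present, 0 = bird absent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
print(round(accuracy(labels, scores), 3), round(roc_auc(labels, scores), 3))
```

A large gap between these held-out numbers and the reported in-dataset results would be the signal described above.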

Figures

Figures reproduced from arXiv: 2605.20853 by Mohd Yamani Idna Idris, Muhammad Mun'im Ahmad Zabidi, Norisma Idris.

**Figure 2.** Quality assurance spectrograms showing acoustic diversity across SEABAD positive samples. This 5 …

**Figure 3.** Species distribution before and after diversity-aware balancing. Pre-balancing (left): 38,481 clips across 1,677 species, Gini …

**Figure 4.** Geographic distribution of SEABAD positive recordings.

Original abstract

Passive acoustic monitoring (PAM) enables large-scale biodiversity assessment, but continuous recording generates large amounts of non-informative audio, creating challenges for storage, power consumption, and long-term edge deployment. Bird audio detection (BAD), which identifies bird vocalizations, can reduce this burden by filtering irrelevant recordings before downstream analysis. However, most BAD systems are trained on temperate datasets despite tropical soundscapes being denser, more species-rich, and acoustically unpredictable. To address this gap, we introduce SEABAD (Southeast Asian Bird Activity Detection), a dataset of 50,000 curated three-second clips from Southeast Asian soundscapes, evenly balanced between bird-present and bird-absent samples. The dataset spans 1,677 bird species and is standardized to 16 kHz mono audio for embedded and low-power inference. We developed a dual-branch curation pipeline: a six-stage positive-label workflow applied to Xeno-Canto recordings, alongside six source-specific negative-label extractions from environmental datasets. These procedures reduced class imbalance by 13.7% (Gini coefficient: 0.601 to 0.519). A manual audit of 1,000 positive clips confirmed 97.8% +/- 0.9% labeling accuracy. Baseline experiments using MobileNetV3-Small achieved 99.57% +/- 0.25% accuracy and 0.9985 +/- 0.0002 AUC across three random seeds. SEABAD and the full curation pipeline are publicly released to support tropical BAD research and energy-efficient acoustic monitoring.
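The abstract reports the Gini coefficient without giving a formula. The sketch below uses the standard rank-based definition for non-negative counts, which may differ from the exact variant the authors used, and also computes the relative reduction between the two reported values. The toy per-species counts and the simple cap used to mimic balancing are hypothetical.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 means perfectly even).

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted in ascending order and ranks i starting at 1.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def relative_reduction(before, after):
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


# Hypothetical clips-per-species counts, before and after capping at 10.
toy_before = [1, 2, 3, 10, 40]
toy_after = [min(c, 10) for c in toy_before]
print(round(gini(toy_before), 3), round(gini(toy_after), 3))

# Relative reduction implied by the rounded coefficients in the abstract.
print(round(relative_reduction(0.601, 0.519), 4))
```

Using the rounded coefficients, the relative reduction comes out at roughly 13.6%, close to the reported 13.7%; the small gap is consistent with the published coefficients having been rounded.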

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces SEABAD, a dataset of 50,000 three-second audio clips from Southeast Asian soundscapes for bird activity detection (BAD) in passive acoustic monitoring (PAM). It is evenly balanced between bird-present and bird-absent classes, spans 1,677 species, and is standardized to 16 kHz mono. The authors describe a dual-branch curation pipeline consisting of a six-stage positive-label workflow on Xeno-Canto recordings and six source-specific negative-label extractions from environmental datasets, which reduced class imbalance (Gini coefficient from 0.601 to 0.519). A manual audit of 1,000 positive clips reports 97.8% +/- 0.9% labeling accuracy. Baseline experiments with MobileNetV3-Small achieve 99.57% +/- 0.25% accuracy and 0.9985 +/- 0.0002 AUC across three seeds. The dataset and pipeline are publicly released to support tropical BAD research and energy-efficient monitoring.

Significance. If the curation process produces clips that faithfully represent the denser, overlapping, and unpredictable acoustics of real tropical PAM deployments, SEABAD would address a clear gap in existing temperate-focused BAD datasets and enable development of models suitable for low-power edge filtering. The public release of both data and the full curation pipeline is a concrete strength that supports reproducibility and extension by other researchers.

major comments (2)

[§3 (Dataset Curation)] The six-stage positive-label workflow applied to Xeno-Canto recordings (typically focal, high-SNR clips of individual species) paired with source-specific negatives extracted from separate environmental datasets risks producing an artificially separable task. This construction does not demonstrably capture the multi-species overlaps, faint calls, and masking sounds that characterize continuous tropical PAM recordings, which directly weakens the claim that the 99.57% baseline demonstrates utility for the intended real-world filtering application.
[Manual Audit paragraph] The reported 97.8% +/- 0.9% accuracy is based on an audit of only 1,000 positive clips; no equivalent audit or inter-annotator details are provided for the negative samples, and the audit does not assess whether the selected clips exhibit the dense, unpredictable acoustic properties asserted for tropical soundscapes. This leaves the generalization to the full 50,000-clip set and to realistic PAM conditions unverified.
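The text here does not say how the audit's +/- 0.9% margin was computed. One plausible reading, offered only as an assumption, is a 95% normal-approximation (Wald) interval for a binomial proportion. Under that assumption, the sketch below reproduces a margin of about 0.9 percentage points for 1,000 audited clips at 97.8% accuracy.

```python
import math


def binomial_margin(p_hat, n, z=1.96):
    """Half-width of a normal-approximation confidence interval.

    z = 1.96 corresponds to a 95% interval; this choice is an assumption,
    since the interval type is not stated alongside the reported margin.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


margin = binomial_margin(0.978, 1000)
print(f"{100 * margin:.2f} percentage points")
```

This margin reflects sampling error on the audited positive clips only; it says nothing about the negative class or about acoustic density, which is the referee's point.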

minor comments (2)

[Abstract / Dataset Curation] The reduction of class imbalance by 13.7% via the Gini coefficient (0.601 to 0.519) is stated without the explicit formula or per-source breakdown used in the calculation.
[Baseline Experiments] No details are given on how the three random seeds were chosen or whether the reported standard deviations reflect only seed variation or also hyperparameter sensitivity in the MobileNetV3-Small baseline.
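To make the seed-variation question concrete, the sketch below summarizes per-seed results as a mean and a standard deviation. The per-seed accuracies are hypothetical placeholders rather than values from the paper, and the paper's convention (sample or population standard deviation) is not stated here, so both are shown.

```python
import statistics


def summarize(values):
    """Return (mean, sample std, population std) for a list of per-seed results."""
    mean = statistics.mean(values)
    sample_std = statistics.stdev(values)        # divides by n - 1
    population_std = statistics.pstdev(values)   # divides by n
    return mean, sample_std, population_std


# Hypothetical accuracies (percent) from three training runs with different seeds.
seed_accuracies = [98.0, 99.0, 100.0]
mean, sample_std, population_std = summarize(seed_accuracies)
print(f"{mean:.2f} +/- {sample_std:.2f} (sample), +/- {population_std:.2f} (population)")
```

With only three runs the two conventions differ noticeably, which is one reason the reported spread is hard to interpret without further detail.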

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for their insightful comments, which have helped us improve the clarity and transparency of our work on the SEABAD dataset. We address each major comment in detail below, indicating where revisions have been made to the manuscript.

Referee: [§3 (Dataset Curation)] The six-stage positive-label workflow applied to Xeno-Canto recordings (typically focal, high-SNR clips of individual species) paired with source-specific negatives extracted from separate environmental datasets risks producing an artificially separable task. This construction does not demonstrably capture the multi-species overlaps, faint calls, and masking sounds that characterize continuous tropical PAM recordings, which directly weakens the claim that the 99.57% baseline demonstrates utility for the intended real-world filtering application.

Authors: We appreciate this observation and agree that the curation approach, relying on focal Xeno-Canto recordings for positive samples and separate environmental sources for negatives, may not fully replicate the complexities of real-world tropical PAM, such as overlapping calls and masking noise. Our intent was to provide a balanced, publicly available dataset to bootstrap research in this under-represented domain, rather than to claim direct equivalence to continuous recordings. The high baseline accuracy reflects performance on this curated set, which we believe serves as a useful benchmark. To strengthen the manuscript, we have added a dedicated limitations paragraph in Section 5 discussing the differences between curated clips and dense PAM soundscapes, along with suggestions for future work involving raw continuous recordings.

revision: yes

Referee: [Manual Audit paragraph] The reported 97.8% +/- 0.9% accuracy is based on an audit of only 1,000 positive clips; no equivalent audit or inter-annotator details are provided for the negative samples, and the audit does not assess whether the selected clips exhibit the dense, unpredictable acoustic properties asserted for tropical soundscapes. This leaves the generalization to the full 50,000-clip set and to realistic PAM conditions unverified.

Authors: We acknowledge the limitations of the audit scope. The manual audit focused on positive clips to verify the presence of bird vocalizations from the specified species, as negative clips were extracted from sources documented to lack avian activity. We have now included additional details in the revised manuscript about the verification process for negative samples, including source documentation and spot-checks. Regarding the assessment of acoustic density, we note that this was not part of the audit design, as the primary goal was label accuracy. We have revised the text to clarify the audit's purpose and added a statement on the need for further validation in realistic conditions. A comprehensive inter-annotator agreement study across the entire dataset would require substantial additional resources and is planned for future extensions.

revision: partial

Circularity Check

0 steps flagged

No circularity: empirical dataset release with direct experimental baselines

The paper introduces SEABAD as a curated dataset of 50,000 clips with a described six-stage positive workflow and source-specific negative extraction, followed by a manual audit and MobileNetV3-Small baseline runs reporting accuracy and AUC. No equations, first-principles derivations, or predictions appear. The reported 99.57% accuracy is an empirical measurement on the released data, not a fitted parameter or self-defined quantity renamed as a prediction. No self-citations are invoked to justify uniqueness or load-bearing premises. The curation process is presented as a transparent pipeline whose outputs are the dataset itself; nothing reduces to its own inputs by construction. This is a standard empirical contribution whose central claims rest on the audit and baseline numbers rather than any circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard audio preprocessing conventions and the domain assumption that the described curation workflow yields accurate labels; no free parameters, invented entities, or ad-hoc axioms beyond these are introduced.

axioms (2)

[standard math] Audio is standardized to 16 kHz mono. Standard practice for embedded audio inference, stated in the abstract.
[domain assumption] The dual-branch curation pipeline produces reliable positive and negative labels. Central to dataset validity; supported by the manual audit but not independently verified beyond the 1,000-clip sample.


[1] [1]

Methods in Ecology and Evolution , volume =

TweetyNet: A neural network that learns to segment and label birdsong and other temporal patterns , author =. Methods in Ecology and Evolution , volume =. 2022 , publisher =

work page 2022

[2] [2]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) , year =

BirdSoundsDenoising: Deep Visual Audio Denoising for Bird Sounds , author =. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) , year =

work page

[3] [3]

2021 , journal =

Compensating class imbalance for acoustic chimpanzee detection with convolutional recurrent neural networks , author =. 2021 , journal =

work page 2021

[4] [4]

2021 , journal =

Low Resource Species Agnostic Bird Activity Detection , author =. 2021 , journal =. doi:10.1109/SiPS52927.2021.00015 , isbn =

work page doi:10.1109/sips52927.2021.00015 2021

[5] [5]

Micronets: Neural network architectures for deploying

Banbury, Colby and Zhou, Chuteng and Fedorov, Igor and Matas, Ramon and Thakker, Urmish and Gope, Dibakar and Janapa Reddi, Vijay and Mattina, Matthew and Whatmough, Paul , year =. Micronets: Neural network architectures for deploying. Proceedings of machine learning and systems , volume =

work page

[6] [6]

Forest sound classification dataset:

Bandara, Meelan and Jayasundara, Roshinie and Ariyarathne, Isuru and Meedeniya, Dulani and Perera, Charith , year =. Forest sound classification dataset:. Sensors , publisher =

work page

[7] [7]

and Riesch, Rüdiger and Koricheva, Julia , year =

Beason, Richard D. and Riesch, Rüdiger and Koricheva, Julia , year =. Bioacoustics , publisher =. doi:10.1080/09524622.2018.1463293 , issn =

work page doi:10.1080/09524622.2018.1463293 2018

[8] [8]

2023 , journal =

Correcting for the effects of class imbalance improves the performance of machine-learning based species distribution models , author =. 2023 , journal =

work page 2023

[9] [9]

2024 , journal =

Semantic Segmentation of Bird Audio Patterns Using a Custom-Built Convolutional Neural Network , author =. 2024 , journal =

work page 2024

[10] [10]

Hearing to the unseen:

Bota, Gerard and Manzano-Rubio, Robert and Catal. Hearing to the unseen:. 2023 , journal =

work page 2023

[11] [11]

2022 , journal =

Loss of temporal structure of tropical soundscapes with intensifying land use in Borneo , author =. 2022 , journal =

work page 2022

[12] Soundscape monitoring for biodiversity assessment in tropical forests. 2022.

[13] Development of Parametric Filter Banks for Sound Feature Extraction. 2023.

[14] Convolutional Recurrent Neural Networks for Bird Audio Detection. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2017. arXiv:1807.03043v4

[15] Bird song: biological themes and variations. 2003.

[16] Neural Network Distillation on IoT Platforms for Sound Event Detection. 2019. doi:10.21437/Interspeech.2019-2394

[17] Sound Event Detection With Binary Neural Networks on Tightly Power-Constrained IoT Devices. ACM International Conference Proceeding Series. 2020. doi:10.1145/3370748.3406588

[18] Chasmai, Mustafa; Shepard, Alexander; Maji, Subhransu; Van Horn, Grant. The ...

[19] Novel Methods to Correct for Observer and Sampling Bias in Presence-Only Species Distribution Models. 2021.

[20] Chawla, Nitesh V; Bowyer, Kevin W; Hall, Lawrence O; Kegelmeyer, W Philip. Journal of Artificial Intelligence Research.

[21] Efficient deep neural network compression for environmental sound classification on microcontroller units. 2024. doi:10.55730/1300-0632.4084

[22] Xception: Deep Learning with Depthwise Separable Convolutions. Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017. 2017. doi:10.1109/CVPR.2017.195

[23] Enabling Multi-Species Bird Classification on Low-Power Bioacoustic Loggers. 2025.

[24] Sampling Techniques. 1977.

[25] A coefficient of agreement for nominal scales. 1960.

[26] Acoustic sensors. 2021.

[27] Class-Balanced Loss Based on Effective Number of Samples. 2019.

[28] Evaluation of Classical Machine Learning Techniques towards Urban Sound Recognition on Embedded Systems. 2019. doi:10.3390/app9183885

[29] Ecological diversity: measuring the unmeasurable. 2018.

[30] David, Robert; Duke, Jared; Jain, Advait; Janapa Reddi, Vijay; Jeffries, Nat; Li, Jian; Kreeger, Nick; Nappier, Ian; Natraj, Meghna; Wang, Tiezhen; et al. Tensorflow ... Proceedings of machine learning and systems.

[31] Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. 1980.

[32] A Hybrid CNN-LSTM Model for Environmental Sound Classification: Leveraging Feature Engineering and Transfer Learning. 2025. doi:10.1016/j.dsp.2025.104079

[33] Fast Environmental Sound Classification Based on Resource Adaptive Convolutional Neural Network. Scientific Reports. 2022. doi:10.1038/s41598-022-10382-x

[34] Freesound Datasets: a Platform for the Creation of Open Audio Datasets. 2017.

[35] Fonseca, Eduardo; Plakal, Manoj; Font, Frederic; Ellis, Daniel P W; Favory, Xavier; Pons, Jordi; Serra, Xavier. General-purpose Tagging of ...

[36] Fonseca, Eduardo; Favory, Xavier; Pons, Jordi; Font, Frederic; Serra, Xavier. IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[37] A state-of-the-art review on birds as indicators of biodiversity: Advances, challenges, and future directions. 2020.

[38] Environmental Noise Dataset for Sound Event Classification and Detection. 2025. doi:10.1038/s41597-025-05991-w

[39] Environmental sound recognition on embedded devices using deep learning: a review. Artificial Intelligence Review. 2025. doi:10.1007/s10462-025-11106-z

[40] Monitoring with Machines: A Review of Computational Bioacoustics. 2026. doi:10.1007/978-3-032-05821-8_16

[41] Concentration and dependency ratios. 1909.

[42] Gini, Corrado. Variabilit...

[43] Giorgi, Giovanni Maria; Gigliarano, Chiara. The ... Journal of Economic Surveys.

[44] Goeau, Hervé; Glotin, Hervé; Vellinga, Willem-pier; Planque, Robert; Joly, Alexis. CLEF: Conference and Labs of the Evaluation Forum.

[45] Gong, Yuan; Chung, Yu-An; Glass, James. arXiv preprint arXiv:2104.01778.

[46] Two Convolutional Neural Networks for Bird Detection in Audio Signals. 25th European Signal Processing Conference, EUSIPCO 2017. 2017. doi:10.23919/EUSIPCO.2017.8081512

[47] Comparing Recurrent Convolutional Neural Networks for Large Scale Bird Species Classification. 2021.

[48] Han, Song; Mao, Huizi; Dally, William J. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and ... ICLR 2016.

[49] Learning from imbalanced data. 2009.

[50] Deep Residual Learning for Image Recognition. 2016.

[51] Addressing class imbalance in image-based biodiversity monitoring. 2021.

[52] Hill, Andrew P; Prince, Peter; Snaddon, Jake L; Doncaster, C Patrick; Rogers, Alex. HardwareX. 2019. doi:10.1016/j.ohx.2019.e00073

[53] Distilling the Knowledge in a Neural Network. 2015.

[54] Hoechst, Jonas; Bellafkir, Hicham; Lampe, Patrick; Vogelbacher, Markus; Muhling, Markus; Schneider, Daniel; Lindner, Kim; Rosner, Sascha; Schabo, Dana G.; Farwig, Nina; Freisleben, Bernd. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatic... doi:10.1007/978-3-031-17436-0

[55] Essential Steps for Establishing a Large-scale Passive Acoustic Monitoring for an Elusive Forest Bird Species: The Eurasian Woodcock (Scolopax rusticola). 2025.

[56] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam. CoRR.

[57] Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam. Searching for ... The IEEE International Conference on Computer Vision (ICCV). 2019. doi:10.1109/ICCV.2019.00140

[58] Squeeze-and-Excitation Networks. 2018.

[59] Hu, Shipeng; Chu, Yihang; Wen, Zhifang; Zhou, Guoxiong; Sun, Yurong; Chen, Aibin. Deep Learning Bird Song Recognition Based on ... Ecological Indicators. 2023. doi:10.1016/j.ecolind.2023.110844

[60] Huang, Zhaolan; Tousnakhoff, Adrien; Kozyr, Polina; Rehausen, Roman; Bie... 2024.

[61] Sampling biases shape our view of the natural world. 2021. doi:10.1111/ecog.05926

[62] Understanding the adequacy and representativeness of species distribution data. 2025.

[63] Huus, Jan; Kelly, Kevin G.; Bayne, Erin M.; Knight, Elly C. Ecological Informatics. 2025. doi:10.1016/j.ecoinf.2025.103122

[64] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. 2015.

[65] Sampling and Modelling Rare Species: Conceptual Guidelines for the Neglected Majority. 2022.

[66] Weight Light, Hear Right: Heart Sound Classification With a Low-Complexity Model. 2024.

[67] Music type classification by spectral contrast feature. 2002.

[68] Johnson, Jeff; Douze, Matthijs; J... Billion-scale similarity search with ... 2019.

[69] Jolles, Jolle W. Broad-Scale Applications of the ... Methods in Ecology and Evolution. doi:10.1111/2041-210X.13652

[70] Joly, Alexis; Go... Overview of ... 2018.

[71] Kahl, Stefan; Wilhelm-Stein, Thomas; Klinck, Holger; Kowerko, Danny; Eibl, Maximilian. Recognizing Birds from Sound - The 2018 ... arXiv. 2018.

[72] Kahl, Stefan; Wood, Connor M; Eibl, Maximilian; Klinck, Holger. Ecological Informatics.

[73] Kahl, Stefan; Denton, Tom; Klinck, Holger; Glotin, Herv... Overview of ... 2021.

[74] Automated detection of gunshots in tropical forests using convolutional neural networks. Ecological Indicators. 2022. doi:10.1016/j.ecolind.2022.109128

[75] Automatic Detection for Bioacoustic Research: A Practical Guide From and for Biologists and Computer Scientists. 2025.

[76] Animal Sounds Classification Scheme Based on Multi-Feature Network with Mixed Datasets. 2020.

[77] Adam: A Method for Stochastic Optimization. 2015.

[78] Liangzhen Lai, Naveen Suda, Vikas Chandra. CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs. arXiv preprint arXiv:1801.06601.

[79] Lapp, Sam; Stahlman, Nickolus; Kitzes, Justin. A quantitative evaluation of the performance of the low-cost ... Sensors.

[80] Computational Bioacoustics and Automated Recognition of Bird Vocalizations: New Tools, Applications and Methods for Bird Monitoring. 2024.