PteroSet is a new strongly annotated dataset of 563 tropical bird recordings (73.62 h) containing 15,372 time-frequency labels for 168 species, released in COCO-style JSON with a binary bird detection baseline.
arXiv preprint arXiv:2312.07439
4 Pith papers cite this work.
citing papers explorer
- A strongly annotated passive acoustic dataset for tropical bird monitoring
PteroSet is a new strongly annotated dataset of 563 tropical bird recordings (73.62 h) containing 15,372 time-frequency labels for 168 species, released in COCO-style JSON with a binary bird detection baseline.
- Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval
Compact binary hypercube embeddings enable efficient text-to-image and text-to-audio retrieval in wildlife databases, with performance competitive with continuous embeddings at far lower memory and search costs.
- AVEX: What Matters for Animal Vocalization Encoding
A large empirical study finds that self-supervised pre-training followed by supervised post-training on mixed bioacoustics and general audio data produces the strongest encoders across 26 datasets for species classification, detection, individual ID, and repertoire discovery.
- CoarseSoundNet: Building a reliable model for ecological soundscape analysis
The paper introduces CoarseSoundNet, a deep learning model for classifying biophony, geophony, and anthropophony in passive acoustic monitoring recordings, reporting performance gains from additional similar data, a silence class, and decision thresholds, plus a case study on acoustic index trends.