The Orchive: Data mining a massive bioacoustic archive

George Tzanetakis; Helena Symonds; Paul Spong; Steven Ness

arxiv: 1307.0589 · v1 · pith:JU6KURNKnew · submitted 2013-07-02 · cs.LG · cs.DB · cs.SD

Steven Ness, Helena Symonds, Paul Spong, George Tzanetakis

classification cs.LG cs.DB cs.SD

keywords audio classifiers orca orchive bioacoustic calls data recordings

The Orchive is a large collection of over 20,000 hours of audio recordings from the OrcaLab research facility located off the northern tip of Vancouver Island. It contains recorded orca vocalizations from 1980 to the present and is one of the largest resources of bioacoustic data in the world. We have developed a web-based interface that allows researchers to listen to these recordings, view waveform and spectral representations of the audio, label clips with annotations, and view the results of machine learning classifiers based on automatic audio feature extraction. In this paper we describe classifiers that discriminate between background noise, orca calls, and the voice notes that are present in most of the tapes. Furthermore, we show classification results for individual calls based on a previously existing orca call catalog. We have also experimentally investigated the scalability of classifiers over the entire Orchive.
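
As a rough illustration of the three-way discrimination described in the abstract (background noise, orca calls, voice notes), here is a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which features or classifiers the authors used, so the two features (RMS energy and zero-crossing rate), the nearest-centroid rule, the helper names, and the synthetic signals standing in for audio frames are all hypothetical and are not taken from the paper or the Orchive.

```python
import math
import random

LABELS = ("background noise", "orca call", "voice note")


def synth_frame(label, rng, n=1024, rate=16000):
    """Hypothetical stand-in audio frame; NOT real Orchive data."""
    if label == "background noise":
        # Low-amplitude Gaussian noise.
        return [rng.gauss(0.0, 0.05) for _ in range(n)]
    # Assumed toy signals: a loud high tone for a call, a quieter low tone for speech.
    freq, amp = (2000.0, 0.8) if label == "orca call" else (200.0, 0.5)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [amp * math.sin(2.0 * math.pi * freq * i / rate + phase)
            + rng.gauss(0.0, 0.01) for i in range(n)]


def frame_features(samples):
    """Return (RMS energy, zero-crossing rate) for one frame."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (n - 1))


def train_centroids(labeled_frames):
    """Average the feature vectors of the training frames for each label."""
    grouped = {}
    for label, frame in labeled_frames:
        grouped.setdefault(label, []).append(frame_features(frame))
    return {
        label: tuple(sum(col) / len(col) for col in zip(*feats))
        for label, feats in grouped.items()
    }


def classify(frame, centroids):
    """Assign the label whose centroid is closest in feature space."""
    feats = frame_features(frame)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


if __name__ == "__main__":
    rng = random.Random(0)
    training = [(lab, synth_frame(lab, rng)) for lab in LABELS for _ in range(5)]
    centroids = train_centroids(training)
    for lab in LABELS:
        print(lab, "->", classify(synth_frame(lab, rng), centroids))
```

Real recordings would need richer spectral features and a stronger classifier than this toy setup; the sketch only shows the general shape of a pipeline that extracts features per frame, fits a model per class, and assigns the closest class.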

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

animal2vec and MeerKAT: A self-supervised transformer for rare-event raw audio input and a large-scale reference dataset for bioacoustics
cs.SD 2024-06 unverdicted novelty 6.0

Introduces animal2vec, a self-supervised transformer for sparse bioacoustic audio, and the MeerKAT meerkat vocalization dataset, claiming outperformance over baselines including in few-shot settings.