
arxiv: 1811.04419 · v1 · pith:OUE7MTOAnew · submitted 2018-11-11 · 💻 cs.SD · cs.MM · eess.AS

Multi-Temporal Resolution Convolutional Neural Networks for Acoustic Scene Classification

keywords: acoustic, classification, neural, resolution, scene, architecture, convolutional, model

In this paper we present a Deep Neural Network architecture for acoustic scene classification that harnesses information from increasing temporal resolutions of Mel-spectrogram segments. The architecture is composed of separate parallel Convolutional Neural Networks, each of which learns spectral and temporal representations for one input resolution. The resolutions are chosen to cover fine-grained characteristics of a scene's spectral texture as well as its distribution of acoustic events. The proposed model shows a 3.56% absolute improvement over the best-performing single-resolution model and 12.49% over the DCASE 2017 Acoustic Scenes Classification task baseline.
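The data flow described in the abstract (one parallel branch per temporal resolution, with the branch outputs combined) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: it assumes that "temporal resolution" means segments of increasing length in frames, the segment lengths `(4, 16, 64)` and the 8 Mel bands are hypothetical values, and `branch_features` is a simple pooling placeholder standing in for a trained CNN branch.

```python
import random


def crop_segment(spec, start, n_frames):
    """Return frames [start, start + n_frames) of a (time x mel) spectrogram."""
    if start < 0 or start + n_frames > len(spec):
        raise ValueError("segment exceeds spectrogram bounds")
    return spec[start:start + n_frames]


def branch_features(segment):
    """Placeholder for one CNN branch (assumption): per-band mean and max over time."""
    n_mels = len(segment[0])
    means = [sum(frame[m] for frame in segment) / len(segment) for m in range(n_mels)]
    maxes = [max(frame[m] for frame in segment) for m in range(n_mels)]
    return means + maxes


def multi_resolution_features(spec, segment_lengths, start=0):
    """Run one branch per segment length and concatenate the branch outputs."""
    features = []
    for n_frames in segment_lengths:
        segment = crop_segment(spec, start, n_frames)
        features.extend(branch_features(segment))
    return features


# Hypothetical setup: random values stand in for a real Mel-spectrogram.
N_MELS = 8
SEGMENT_LENGTHS = (4, 16, 64)  # short = spectral texture, long = event distribution
rng = random.Random(0)
spec = [[rng.random() for _ in range(N_MELS)] for _ in range(64)]

fused = multi_resolution_features(spec, SEGMENT_LENGTHS)
print(len(fused))  # 3 branches * 2 * N_MELS = 48
```

In a real model each placeholder branch would be a separate CNN with its own weights, and the concatenated vector would feed a classifier over the scene classes.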

