Acoustic Scene Classification Using Fusion of Attentive Convolutional Neural Networks for DCASE2019 Challenge

arxiv: 1907.07127 · v1 · pith:CHCXDEVI · submitted 2019-07-13 · eess.AS · cs.SD

Hossein Zeinali, Lukáš Burget, Jan "Honza" Černocký

keywords: fusion, different, network, networks, acoustic, called, challenge, classification

This report describes the Brno University of Technology (BUT) team's submissions for Task 1 (Acoustic Scene Classification, ASC) of the DCASE-2019 challenge and provides an analysis of the different methods. The proposed approach is a fusion of three different Convolutional Neural Network (CNN) topologies. The first is a VGG-like two-dimensional CNN. The second is also a two-dimensional CNN, which uses Max-Feature-Map activation and is called Light-CNN (LCNN). The third is a one-dimensional CNN that is mainly used for speaker verification and is called the x-vector topology. All proposed networks use a self-attention mechanism for statistics pooling. As features, we use 256-dimensional log Mel-spectrograms. Our submissions are fusions of several networks trained on a generated 4-fold evaluation setup, combined using different fusion strategies.
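
The building blocks named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It shows Max-Feature-Map activation (element-wise maximum over two halves of the channels), attention-weighted statistics pooling (weighted mean and standard deviation over frames), and a simple score-averaging fusion. The function names are hypothetical, the attention scores are passed in as given numbers (in a real network they would be produced by learned layers), and weighted averaging is only one assumed fusion strategy, since the abstract does not say which strategies were used.

```python
import math


def max_feature_map(channels):
    """Max-Feature-Map: split the channels into two halves and keep the
    element-wise maximum, which halves the number of channels."""
    half = len(channels) // 2
    return [max(a, b) for a, b in zip(channels[:half], channels[half:])]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_stats_pooling(frames, scores):
    """Pool T frame vectors (each of length D) into one vector of length 2*D.

    `scores` holds one unnormalised attention score per frame (assumed to
    come from a learned attention layer that is not modelled here). The
    output is the attention-weighted mean followed by the weighted
    standard deviation of each dimension.
    """
    weights = softmax(scores)
    dim = len(frames[0])
    mean = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    var = [
        sum(w * (f[d] - mean[d]) ** 2 for w, f in zip(weights, frames))
        for d in range(dim)
    ]
    std = [math.sqrt(max(v, 0.0)) for v in var]
    return mean + std


def fuse_scores(system_scores, fusion_weights=None):
    """Weighted average of per-class scores from several systems
    (an assumed fusion strategy; equal weights by default)."""
    n = len(system_scores)
    if fusion_weights is None:
        fusion_weights = [1.0 / n] * n
    n_classes = len(system_scores[0])
    return [
        sum(w * s[c] for w, s in zip(fusion_weights, system_scores))
        for c in range(n_classes)
    ]
```

With equal attention scores, the pooling reduces to the ordinary mean and standard deviation over frames. Larger scores shift both statistics toward the frames the attention favours.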
