A challenge submission system using expanded training data, feature-specific branches, and post-processing achieves up to 81.25% hierarchical F1 on BSD10k-v1.2.
A Multi-Branch Hierarchy-Aware Framework for Heterogeneous Audio Classification
abstract
This technical report describes our system for Task 1 of the DCASE 2026 Challenge, which aims to classify heterogeneous audio recordings according to the Broad Sound Taxonomy (BST). The task requires both accurate second-level prediction and consistency with the top-level taxonomy. Our system is built on CLAP-based audio-text representations and is improved through three strategies: expanding the training set with a filtered subset of BSD35k, enhancing acoustic modeling with feature-specific branches, and refining predictions with hierarchy-aware classifiers and KNN-based post-processing. Among the acoustic features considered, the log-STFT branch provides the strongest single-model performance. With KNN-based post-processing, our best single system achieves a hierarchical F1 score (Hier. F1) of 80.84% on the BSD10k-v1.2 set under the same evaluation protocol as the baseline. We further construct two ensemble systems by combining models with complementary acoustic features and classification heads; these achieve Hier. F1 scores of 81.25% and 81.18%, respectively.
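The abstract reports results as hierarchical F1 but does not spell out the formula. As a minimal sketch, the snippet below uses one common set-based definition for a two-level taxonomy: each label is expanded to the set containing itself and its top-level parent, and precision and recall are computed from the overlap between predicted and true sets, pooled over all clips. The exact metric used by the challenge may differ, and the class names, the `PARENT` mapping, and the function names are hypothetical placeholders rather than the real BST classes or the authors' code.

```python
from typing import Dict, List, Set

# Hypothetical two-level taxonomy: second-level label -> top-level parent.
# These names are illustrative placeholders, not the actual BST classes.
PARENT: Dict[str, str] = {
    "music/solo": "music",
    "music/multi": "music",
    "speech/solo": "speech",
}


def ancestors(label: str) -> Set[str]:
    """Return the second-level label together with its top-level parent."""
    return {label, PARENT[label]}


def hierarchical_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Set-based hierarchical F1, pooled (micro-averaged) over all clips.

    Assumed definition: precision = overlap / predicted nodes,
    recall = overlap / true nodes, F1 = their harmonic mean.
    """
    overlap = pred_total = true_total = 0
    for t, p in zip(y_true, y_pred):
        t_set, p_set = ancestors(t), ancestors(p)
        overlap += len(t_set & p_set)  # nodes shared by truth and prediction
        pred_total += len(p_set)
        true_total += len(t_set)
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / true_total
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = ["music/solo", "music/solo", "speech/solo"]
    preds = ["music/solo", "music/multi", "music/solo"]
    print(hierarchical_f1(truth, preds))
```

Under this definition, a prediction that misses the second-level class but lands in the correct top-level class still earns partial credit, which is why consistency with the top-level taxonomy affects the score.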