Sound Event Detection with Boundary-Aware Optimization and Inference

Arnoldas Jasonas; Camilla Clark; Çağdaş Bilen; Chi Ian Tang; Cosmin Frateanu; Florian Schmid; Gerhard Widmer; Giacomo Ferroni; Juan Azcarreta Ortiz; Sanjeel Parekh

arxiv: 2601.04178 · v2 · pith:B37PCDLInew · submitted 2026-01-07 · 📡 eess.AS · cs.SD

Florian Schmid, Chi Ian Tang, Sanjeel Parekh, Vamsi Krishna Ithapu, Juan Azcarreta Ortiz, Giacomo Ferroni, Yijun Qian, Arnoldas Jasonas, Cosmin Frateanu, Camilla Clark, Gerhard Widmer, Çağdaş Bilen

keywords: event, detection, temporal, modeling, approach, audioset, boundary-aware, inference

Abstract

Temporal detection problems appear in many fields, including time-series estimation, activity recognition, and sound event detection (SED). In this work, we propose a new approach to temporal event modeling that explicitly models event onsets and offsets and introduces boundary-aware optimization and inference strategies that substantially improve temporal event detection. The methodology incorporates new temporal modeling layers, Recurrent Event Detection (RED) and Event Proposal Network (EPN), which, together with tailored loss functions, enable more effective and precise temporal event detection. We evaluate the proposed method in the SED domain using a subset of the temporally strongly annotated portion of AudioSet. Experimental results show that our approach not only outperforms traditional frame-wise SED models with state-of-the-art post-processing, but also removes the need for post-processing hyperparameter tuning, and scales to achieve new state-of-the-art performance across all AudioSet Strong classes.
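
For context, the frame-wise baseline with post-processing that the abstract compares against can be sketched as follows. This is a minimal illustration of a common pipeline (threshold the per-frame probabilities, smooth the binary decisions with a median filter, then read off onset and offset times), not the method proposed in the paper. The function names, the default threshold, the filter width, and the hop size are all assumptions chosen for the example.

```python
from statistics import median
from typing import List, Tuple


def median_filter(values: List[int], width: int) -> List[int]:
    """Sliding median over an odd window; edges are padded with the end values."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(values))]


def frames_to_events(
    probs: List[float],
    threshold: float = 0.5,    # assumed decision threshold (a tunable hyperparameter)
    filter_width: int = 3,     # assumed median filter length in frames (also tunable)
    hop_seconds: float = 0.04, # assumed frame hop; depends on the model's front end
) -> List[Tuple[float, float]]:
    """Turn per-frame probabilities for one class into (onset, offset) pairs in seconds."""
    if not probs:
        return []
    # Step 1: frame-wise binary decisions.
    active = [1 if p >= threshold else 0 for p in probs]
    # Step 2: smooth decisions to remove isolated spikes and fill short gaps.
    active = median_filter(active, filter_width)
    # Step 3: each contiguous run of active frames becomes one event.
    events = []
    start = None
    for i, on in enumerate(active):
        if on == 1 and start is None:
            start = i
        elif on != 1 and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None
    if start is not None:
        events.append((start * hop_seconds, len(active) * hop_seconds))
    return events


if __name__ == "__main__":
    probs = [0.1, 0.9, 0.1, 0.8, 0.9, 0.7, 0.2, 0.1]
    print(frames_to_events(probs, 0.5, 1, 1.0))  # [(1.0, 2.0), (3.0, 6.0)]
    print(frames_to_events(probs, 0.5, 3, 1.0))  # [(2.0, 6.0)]
```

In this toy input, changing only the filter width merges two detections into one and moves the onset by a frame, which illustrates why the threshold and filter length in such pipelines typically have to be tuned and why detected boundaries depend on them.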

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Executable Boundary Contracts for Sound Event Traces
cs.LO · 2026-05 · unverdicted · novelty 6.0 · partial

Defines executable boundary contracts for sound event traces using an STL-embeddable Boolean fragment plus interval and duration clauses, then evaluates them on speech and soundscape data where they disagree with stan...