
arxiv: 1609.09743 · v1 · pith:EAL6SCLInew · submitted 2016-09-30 · 💻 cs.SD · stat.AP

Rectified binaural ratio: A complex T-distributed feature for robust sound localization

classification 💻 cs.SD stat.AP
keywords binaural, sound, noise, ratio, source, complex, corrupted, feature

Most existing methods in binaural sound source localization rely on some kind of aggregation of phase- and level-difference cues in the time-frequency plane. While different aggregation schemes exist, they are often heuristic and suffer in adverse noise conditions. In this paper, we introduce the rectified binaural ratio as a new feature for sound source localization. We show that for Gaussian-process point source signals corrupted by stationary Gaussian noise, this ratio follows a complex t-distribution with explicit parameters. This new formulation provides a principled and statistically sound way to aggregate binaural features in the presence of noise. We subsequently derive two simple and efficient methods for robust relative transfer function and time-delay estimation. Experiments on heavily corrupted simulated and speech signals demonstrate the robustness of the proposed scheme.
