arxiv: 1906.08847 · v1 · pith:WQ6GDV32new · submitted 2019-06-20 · 📡 eess.AS · cs.SD

A Signal Subspace Rotation Method for Localization of Multiple Wideband Sound Sources

Pith reviewed 2026-05-25 18:47 UTC · model grok-4.3

keywords: wideband sound source localization · direction of arrival estimation · signal subspace rotation · inter-channel phase differences · eigenvector domain · microphone arrays · multiple sources

The pith

Rotating the signal subspace in the eigenvector domain normalizes narrowband statistics for wideband DOA estimation of multiple sound sources.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses extending narrowband multichannel sound source localization to wideband signals. It proposes rotating the estimated signal subspace to the wideband counterpart in the eigenvector domain. This normalizes the narrowband signal statistics so that normalized inter-channel phase differences can be estimated. Wideband direction-of-arrival estimates then follow from those normalized statistics. The approach uses less computation than applying narrowband methods to each frequency separately and needs no extra prior knowledge.

Core claim

The proposed algorithm normalizes the narrowband signal statistics by rotating the estimated signal subspace to the wideband counterpart in the eigenvector domain. Then the wideband DOA estimate can be obtained by estimating the normalized IPD from these wideband signal statistics. In addition to requiring less computational complexity compared to repeating the narrowband algorithms for all relevant frequencies of wideband signals, the proposed method also does not require any additional prior knowledge. The experimental results demonstrate the efficacy and the robustness of the proposed method.
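A minimal single-source sketch of the IPD normalization idea, in Python with NumPy. It is not the paper's rotation operator, which this page does not reproduce. It assumes a free-field, far-field uniform linear array, exact (model) covariance matrices, no spatial aliasing, and a speed of sound of 343 m/s. The function names, the reference frequency, and the array parameters are all hypothetical.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def steering_vector(freq_hz, doa_rad, num_mics, spacing_m):
    """Far-field ULA steering vector, first microphone as phase reference."""
    ipd = 2.0 * np.pi * freq_hz * spacing_m * np.sin(doa_rad) / C
    return np.exp(-1j * ipd * np.arange(num_mics))


def principal_eigenvector(cov):
    """Eigenvector of the largest eigenvalue (eigh sorts eigenvalues ascending)."""
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]


def adjacent_ipd(vec):
    """Phase step between neighbouring microphones, averaged over all pairs.

    Using neighbour products avoids wrapping of the absolute per-element phase.
    """
    return -np.angle(np.vdot(vec[:-1], vec[1:]))


def wideband_doa(covs, freqs_hz, ref_freq_hz, spacing_m):
    """Scale each narrowband IPD to the reference frequency, then pool."""
    normalized = [
        adjacent_ipd(principal_eigenvector(cov)) * ref_freq_hz / f
        for cov, f in zip(covs, freqs_hz)
    ]
    mean_ipd = np.mean(normalized)
    return np.arcsin(mean_ipd * C / (2.0 * np.pi * ref_freq_hz * spacing_m))


# Illustrative setup: 4 microphones, 4 cm spacing, one source at 25 degrees.
NUM_MICS, SPACING, REF_FREQ = 4, 0.04, 1000.0
freqs = np.arange(500.0, 3501.0, 500.0)
true_doa = np.deg2rad(25.0)
covs = []
for f in freqs:
    a = steering_vector(f, true_doa, NUM_MICS, SPACING)
    covs.append(np.outer(a, a.conj()) + 0.1 * np.eye(NUM_MICS))  # signal + white noise

estimate_deg = np.rad2deg(wideband_doa(covs, freqs, REF_FREQ, SPACING))
print(f"estimated DOA: {estimate_deg:.3f} degrees")
```

The sketch performs a single DOA conversion after pooling, rather than one per frequency bin, which is the kind of saving the core claim describes.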

What carries the argument

Signal subspace rotation in the eigenvector domain, which aligns narrowband statistics with their wideband equivalents so that IPD information is preserved.

If this is right

  • Wideband DOA estimates follow from normalized IPD derived after the subspace rotation.
  • Computational cost stays lower than repeating narrowband algorithms across all frequencies.
  • No additional prior knowledge beyond the narrowband method is required.
  • The method applies to localization of multiple wideband sound sources.
  • Experimental results indicate the approach is effective and robust.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The rotation step could reduce latency in real-time microphone-array applications on embedded hardware.
  • Similar eigenvector-domain alignment might apply to other wideband array processing tasks that rely on phase.
  • Further tests on reverberant rooms or moving sources would reveal whether the preserved IPD remains sufficient.
  • Combining the rotation with existing narrowband subspace trackers could yield hybrid wideband trackers.

Load-bearing premise

That rotating the estimated signal subspace in the eigenvector domain produces accurate normalized wideband statistics that preserve the necessary inter-channel phase information without distortion.
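A short worked relation shows when this premise can hold in the idealized case. It assumes the free-field, far-field uniform linear array with spacing ∆d from the paper's signal-model excerpt; the symbols c (speed of sound), θ (DOA), and f_0 (reference frequency) are introduced here for illustration and may not match the paper's notation.

```latex
% IPD between adjacent microphones at frequency f, and its value scaled to f_0
\varphi(f) = \frac{2\pi f \,\Delta d \,\sin\theta}{c},
\qquad
\bar{\varphi} = \frac{f_0}{f}\,\varphi(f) = \frac{2\pi f_0 \,\Delta d \,\sin\theta}{c}

% The scaling is exact only while the measured phase is unwrapped
|\varphi(f)| < \pi
\;\Longleftrightarrow\;
f\,|\sin\theta| < \frac{c}{2\,\Delta d}
```

Under these assumptions the normalized IPD no longer depends on f, so all frequency bins carry the same DOA information. Above the aliasing limit the measured phase wraps, and a plain rescaling would distort it.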

What would settle it

If direct wideband processing or known ground-truth positions yield DOA estimates that differ substantially from those obtained via the rotated narrowband subspaces, the normalization step would be shown to lose required phase information.
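One way to quantify "differ substantially" is an angular RMSE between the two sets of estimates. The helper below is a hypothetical sketch: the name doa_rmse_deg and the sort-based pairing are assumptions, not the paper's evaluation protocol. Sorting pairs the sources correctly only when both sets contain the same number of well-separated DOAs.

```python
import numpy as np


def doa_rmse_deg(estimated_deg, reference_deg):
    """RMSE in degrees between two DOA sets, paired by sorted order."""
    est = np.sort(np.asarray(estimated_deg, dtype=float))
    ref = np.sort(np.asarray(reference_deg, dtype=float))
    if est.shape != ref.shape:
        raise ValueError("both sets must contain the same number of DOAs")
    return float(np.sqrt(np.mean((est - ref) ** 2)))


# Illustrative values only: two sources, each estimate off by one degree.
rmse = doa_rmse_deg([34.0, -21.0], [-20.0, 35.0])
print(f"RMSE: {rmse:.2f} degrees")
```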

Original abstract

In this paper, the problem of extending narrowband multichannel sound source localization algorithms to the wideband case is addressed. The DOA estimation of narrowband algorithms is based on the estimate of inter-channel phase differences (IPD) between microphones of the sound sources. A new method for wideband sound source DOA estimation based on signal subspace rotation is present. The proposed algorithm normalizes the narrowband signal statistics by rotating the estimated signal subspace to the wideband counterpart in the eigenvector domain. Then the wideband DOA estimate can be obtained by estimating the normalized IPD from these wideband signal statistics. In addition to requiring less computational complexity compared to repeating the narrowband algorithms for all relevant frequencies of wideband signals, the proposed method also does not require any additional prior knowledge. The experimental results demonstrate the efficacy and the robustness of the proposed method.
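For context, the narrowband step the abstract refers to can be sketched as follows, assuming an ESPRIT-style estimator on a uniform linear array, as the signal-model excerpt in the reference graph suggests. This is textbook least-squares ESPRIT, not code from the paper. The function name esprit_doa, the speed of sound, and the demo parameters are illustrative assumptions, and the covariance matrix is built from the model rather than estimated from recordings.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def esprit_doa(cov, num_sources, freq_hz, spacing_m):
    """Narrowband least-squares ESPRIT on a ULA covariance matrix.

    Returns the estimated DOAs in radians, sorted ascending.
    """
    _, vecs = np.linalg.eigh(cov)             # eigenvalues ascending
    signal_subspace = vecs[:, -num_sources:]  # P x Q signal subspace
    upper = signal_subspace[:-1, :]           # microphones 1 .. P-1
    lower = signal_subspace[1:, :]            # microphones 2 .. P
    # Rotational invariance: lower ~= upper @ psi
    psi, *_ = np.linalg.lstsq(upper, lower, rcond=None)
    # Eigenvalues of psi are exp(-j * IPD) for each source
    ipd = -np.angle(np.linalg.eigvals(psi))
    sin_theta = ipd * C / (2.0 * np.pi * freq_hz * spacing_m)
    return np.sort(np.arcsin(sin_theta))


# Illustrative setup: P = 6 microphones, Q = 2 sources, one frequency bin.
P, SPACING, FREQ = 6, 0.04, 2000.0
true_doas = np.deg2rad([-20.0, 35.0])
ipds = 2.0 * np.pi * FREQ * SPACING * np.sin(true_doas) / C
A = np.exp(-1j * np.outer(np.arange(P), ipds))        # P x Q steering matrix
cov = A @ np.diag([1.0, 0.8]) @ A.conj().T + 0.01 * np.eye(P)

doas_deg = np.rad2deg(esprit_doa(cov, 2, FREQ, SPACING))
print(doas_deg)
```

Repeating this estimate for every frequency bin and pooling the results is the baseline whose computational cost the abstract says the proposed method undercuts.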

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes extending narrowband DOA estimation to wideband sound sources via a signal subspace rotation method. Narrowband signal statistics are normalized by rotating the estimated signal subspace to its wideband counterpart in the eigenvector domain; wideband DOA is then recovered by estimating normalized inter-channel phase differences (IPD) from the resulting statistics. The approach is claimed to require less computation than applying narrowband methods across all frequencies and to need no additional prior knowledge, with experiments asserted to show efficacy and robustness.

Significance. If the rotation operation can be shown to preserve IPD without distortion for frequency-dependent array manifolds, the method would offer a lower-complexity alternative to incoherent or coherent wideband processing. The absence of any derivation, invariance argument, or quantitative validation in the manuscript, however, leaves the central claim unsupported.

major comments (2)
  1. [Abstract] Abstract: the central construction asserts that rotating the narrowband signal-subspace eigenvectors produces normalized wideband statistics from which IPD can be read off directly. No mapping, commutativity argument, or invariance proof is supplied showing that the rotated eigenvectors remain proportional to the wideband array manifold when the steering vectors vary with frequency; without this, the phase-extraction step is not guaranteed to recover correct wideband DOA.
  2. [Abstract] Abstract: the claim that experiments demonstrate efficacy and robustness is unsupported by any reported error metrics, comparison baselines, array geometry, frequency range, or SNR conditions. This absence makes it impossible to assess whether the method actually outperforms or matches existing wideband techniques.
minor comments (1)
  1. [Abstract] Abstract: 'is present' should read 'is presented'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and will revise the manuscript to strengthen the theoretical justification and abstract presentation.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central construction asserts that rotating the narrowband signal-subspace eigenvectors produces normalized wideband statistics from which IPD can be read off directly. No mapping, commutativity argument, or invariance proof is supplied showing that the rotated eigenvectors remain proportional to the wideband array manifold when the steering vectors vary with frequency; without this, the phase-extraction step is not guaranteed to recover correct wideband DOA.

    Authors: The full manuscript (Section III) defines the rotation operator explicitly in the eigenvector domain to normalize the narrowband statistics to their wideband counterpart. We agree that an explicit invariance argument would improve clarity. In the revised version we will insert a short derivation showing that the rotation is unitary and commutes with the phase extraction in a manner that leaves the inter-channel phase differences unchanged, thereby guaranteeing that the extracted IPD corresponds to the wideband array manifold. This addition directly addresses the concern. revision: yes

  2. Referee: [Abstract] Abstract: the claim that experiments demonstrate efficacy and robustness is unsupported by any reported error metrics, comparison baselines, array geometry, frequency range, or SNR conditions. This absence makes it impossible to assess whether the method actually outperforms or matches existing wideband techniques.

    Authors: The experimental protocol, including array geometry, frequency bands, SNR ranges, RMSE metrics, and comparisons against incoherent and coherent baselines, is reported in detail in Section IV. The abstract provides only a high-level summary. We will revise the abstract to incorporate the key quantitative conditions and performance figures so that the efficacy claims are self-contained at the abstract level. revision: yes

Circularity Check

0 steps flagged

No circularity: method is a direct algorithmic construction without self-referential reduction

Full rationale

The paper proposes a subspace rotation procedure to normalize narrowband statistics into wideband equivalents for IPD-based DOA estimation. No equations are shown that define the target wideband statistics in terms of the rotation operator itself, nor does any step fit a parameter on a data subset and then relabel the output as a prediction. The central claim is an explicit construction (rotate eigenvectors, extract normalized IPD) whose validity rests on the geometric properties of the rotation, not on a self-citation chain or a fitted quantity renamed as output. External validation via experiments is referenced, keeping the derivation self-contained rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities; all fields left empty due to lack of detail.

pith-pipeline@v0.9.0 · 5679 in / 1023 out tokens · 24370 ms · 2026-05-25T18:47:46.723136+00:00 · methodology


Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. [1]

    Traditional sound source localization algorithms such as GCC-PHAT [3] estimates the time delay of arrival (TDOA) between a pair of microphones to localize a single source

    INTRODUCTION Sound source localization is an important component in many multichannel signal processing systems aiming, e.g., at source tracking, signal separation, enhancement and noise suppression [1, 2]. Traditional sound source localization algorithms such as GCC-PHAT [3] estimates the time delay of arrival (TDOA) between a pair of microphones t...

  2. [2]

    We use P microphones (P > Q) that form a linear array with a uniform spacing ∆d and with mutually uncorrelated zero-mean sensor noise

    SIGNAL MODEL We assume a microphone array which fits the requirements for using ESPRIT, i.e., to localize Q sources. We use P microphones (P > Q) that form a linear array with a uniform spacing ∆d and with mutually uncorrelated zero-mean sensor noise. The environment is assumed with free-field and far-field conditions. The source-microphone model in the...

  3. [3]

    In case of a single source scenario, for each narrowband component, the proposed approach rotates the eigenvectors that span the signal subspaces by normalizing the IPD

    PROPOSED APPROACH The proposed approach considers a novel way for adapting ESPRIT to a single or multiple wideband sources. In case of a single source scenario, for each narrowband component, the proposed approach rotates the eigenvectors that span the signal subspaces by normalizing the IPD. In case of multiple sources, it reconstructs covariance matric...

  4. [4]

    An iterative accumulation method is proposed to solve this numerical sensitivity problem

    IMPLEMENTATION In the estimated signal subspace rotation step (8), when f gets larger, a finer quantization is required to limit the effect of quantization errors on the DOA estimation. An iterative accumulation method is proposed to solve this numerical sensitivity problem. In each iteration, the estimated signal subspace from the i-th (i ∈ N+) f...

  5. [5]

    The evaluation includes comparisons to the narrowband ESPRIT with the histogram method (hist-ESPRIT) [21] and to the CSS method [13]

    EVALUATION A set of experiments was performed in order to evaluate the performance of the algorithm using real-world recordings. The evaluation includes comparisons to the narrowband ESPRIT with the histogram method (hist-ESPRIT) [21] and to the CSS method [13]. 5.1. Experimental setup The recordings were captured by a uniform linear array (ULA) with ...

  6. [6]

    CONCLUSION A wideband signal subspace DOA estimation approach is presented. The proposed signal subspace rotation method and the narrowband signal covariance matrix reconstruction method are high and outperform the existing conventional approaches in computational complexity. Additionally, the proposed approach avoids the necessity of additional prior k...

  7. [7]

    Multi-channel noise reduction for hands-free voice communication on mobile phones,

    W. Jin, M. J. Taghizadeh, K. Chen, and W. Xiao, “Multi-channel noise reduction for hands-free voice communication on mobile phones,” in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2017, pp. 506–510

  8. [8]

    Multi-channel noise reduction with interference suppression on mobile phones,

    W. Jin, B. Desikan, A. Kumar, and K. Chen, “Multi-channel noise reduction with interference suppression on mobile phones,” in 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2018, pp. 201–205

  9. [9]

    The generalized correlation method for estimation of time delay,

    C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” Acoustics, Speech and Signal Processing, IEEE Transactions, vol. 24(4), pp. 320–327, 1976

  10. [10]

    J. H. DiBiase, A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays, Ph.D. thesis, Brown University, 2000

  11. [11]

    Multiple emitter location and signal parameter estimation,

    R. Schmidt, “Multiple emitter location and signal parameter estimation,” Antennas and Propagation, IEEE Transactions, vol. 34(3), pp. 276–280, 1986

  12. [12]

    ESPRIT-estimation of signal parameters via rotational invariance techniques,

    R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques,” Acoustics, Speech and Signal Processing, IEEE Transactions, vol. 37(7), pp. 984–995, 1989

  13. [13]

    TDOA estimation for multiple sound sources in noisy and reverberant environments using broadband independent component analysis,

    A. Lombard, Y. Zheng, H. Buchner, and W. Kellermann, “TDOA estimation for multiple sound sources in noisy and reverberant environments using broadband independent component analysis,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19(6), pp. 1490–1503, 2011

  14. [14]

    Convolutive BSS of short mixtures by ICA recursively regularized across frequencies,

    F. Nesta, P. Svaizer, and M. Omologo, “Convolutive BSS of short mixtures by ICA recursively regularized across frequencies,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 624–639, 2011

  15. [15]

    STFT bin selection for localization algorithms based on the sparsity of speech signal spectra,

    A. Brendel, C. Huang, and W. Kellermann, “STFT bin selection for localization algorithms based on the sparsity of speech signal spectra,” in European Congress and Exposition on Noise Control Engineering. IEEE, 2018, pp. 2561–2568

  16. [16]

    DOA estimation for multiple sparse sources with normalized observation vector clustering,

    S. Araki, H. Sawada, R. Mukai, and S. Makino, “DOA estimation for multiple sparse sources with normalized observation vector clustering,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). IEEE, 2006, vol. 5

  17. [17]

    EB-ESPRIT: 2D localization of multiple wideband acoustic sources using eigenbeams,

    H. Teutsch and W. Kellermann, “EB-ESPRIT: 2D localization of multiple wideband acoustic sources using eigenbeams,” IEEE International Conference, vol. iii/89 - iii/92 Vol. 3(3), pp. 89–92, 2005

  18. [18]

    Coherent signals direction-of-arrival estimation using a spherical microphone array: Frequency smoothing approach,

    D. Khaykin and B. Rafaely, “Coherent signals direction-of-arrival estimation using a spherical microphone array: Frequency smoothing approach,” in Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA ’09. IEEE Workshop on, 2009, pp. 221–224

  19. [19]

    Coherent signal-subspace processing for the detection and estimation of angles of arrival of multiple wide-band sources,

    H. Wang and M. Kaveh, “Coherent signal-subspace processing for the detection and estimation of angles of arrival of multiple wide-band sources,” IEEE Transactions on Acoustics, Speech, and Signal Processing, 1985

  20. [20]

    Estimation of angles of arrivals of broadband signals,

    A. Shaw and R. Kumaresan, “Estimation of angles of arrivals of broadband signals,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1987

  21. [21]

    Directions-of-arrival estimations of multiple coherent broadband signals,

    Y.-H. Chen and R.-H. Chen, “Directions-of-arrival estimations of multiple coherent broadband signals,” IEEE Transactions on Aerospace and Electronic Systems, vol. 29, no. 3, pp. 1035–1043, 1993

  22. [22]

    Focussing matrices for coherent signal-subspace processing,

    H. Hung and M. Kaveh, “Focussing matrices for coherent signal-subspace processing,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 8, pp. 1272–1281, 1988

  23. [23]

    Direction-of-arrival estimation for wide-band signals using the ESPRIT algorithm,

    B. Ottersten and T. Kailath, “Direction-of-arrival estimation for wide-band signals using the ESPRIT algorithm,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 2, pp. 317–327, 1990

  24. [24]

    Wideband multilinear array processing through tensor decomposition,

    F. Raimondi, P. Comon, and O. Michel, “Wideband multilinear array processing through tensor decomposition,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

  25. [25]

    Broadband DOA estimation using frequency invariant beamforming,

    D. B. Ward, Z. Ding, and R. A. Kennedy, “Broadband DOA estimation using frequency invariant beamforming,” IEEE Transactions on Signal Processing, 1998

  26. [26]

    Robust phase replication method for spatial aliasing problem in multiple sound sources localization,

    K. Chen, J. T. Geiger, W. Jin, M. Taghizadeh, and W. Kellermann, “Robust phase replication method for spatial aliasing problem in multiple sound sources localization,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017

  27. [27]

    Comparative performance of ESPRIT and MUSIC for direction-of-arrival estimation,

    R. Roy, A. Paulraj, and T. Kailath, “Comparative performance of ESPRIT and MUSIC for direction-of-arrival estimation,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1987

  28. [28]

    An audio-visual corpus for speech perception and automatic speech recognition,

    M. Cooke, J. Barker, S. Cunningham, and X. Shao, “An audio-visual corpus for speech perception and automatic speech recognition,” The Journal of the Acoustical Society of America, vol. 120(5), pp. 2421–2424, 2006