A Signal Subspace Rotation Method for Localization of Multiple Wideband Sound Sources
Pith reviewed 2026-05-25 18:47 UTC · model grok-4.3
The pith
Rotating the signal subspace in the eigenvector domain normalizes narrowband statistics for wideband DOA estimation of multiple sound sources.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed algorithm normalizes the narrowband signal statistics by rotating the estimated signal subspace to the wideband counterpart in the eigenvector domain. Then the wideband DOA estimate can be obtained by estimating the normalized IPD from these wideband signal statistics. In addition to requiring less computational complexity compared to repeating the narrowband algorithms for all relevant frequencies of wideband signals, the proposed method also does not require any additional prior knowledge. The experimental results demonstrate the efficacy and the robustness of the proposed method.
What carries the argument
Signal subspace rotation in the eigenvector domain, which aligns narrowband statistics with their wideband equivalents to preserve IPD information.
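The idea of normalizing a per-frequency IPD can be illustrated with a minimal single-source sketch. This is not the paper's rotation operator, which is not reproduced on this page; it only rescales the phase read from the principal eigenvector of an ideal covariance matrix to a common reference frequency. The speed of sound, the steering-vector sign convention, the array parameters, and all function names are assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def steering_vector(freq_hz, theta_rad, num_mics, spacing_m):
    """Far-field ULA steering vector (sign convention assumed)."""
    delays = np.arange(num_mics) * spacing_m * np.sin(theta_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)

def principal_eigenvector(cov):
    """Eigenvector belonging to the largest eigenvalue of a Hermitian matrix."""
    _, vecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]

def normalized_ipd(eigvec, freq_hz, ref_freq_hz):
    """Adjacent-microphone phase difference, rescaled to ref_freq_hz."""
    ipd = np.angle(np.sum(eigvec[1:] * np.conj(eigvec[:-1])))
    return ipd * ref_freq_hz / freq_hz

def doa_from_normalized_ipd(ipd, ref_freq_hz, spacing_m):
    """Invert the ULA phase model at the reference frequency."""
    sin_theta = -ipd * SPEED_OF_SOUND / (2 * np.pi * ref_freq_hz * spacing_m)
    return np.arcsin(np.clip(sin_theta, -1.0, 1.0))

def estimate_doa_per_bin(theta_rad, freqs_hz, ref_freq_hz=1000.0,
                         num_mics=4, spacing_m=0.04, noise_var=0.01):
    """Return one DOA estimate per frequency bin from ideal covariance matrices."""
    estimates = []
    for f in freqs_hz:
        a = steering_vector(f, theta_rad, num_mics, spacing_m)
        # Ideal single-source covariance plus white sensor noise (hypothetical data).
        cov = np.outer(a, a.conj()) + noise_var * np.eye(num_mics)
        v = principal_eigenvector(cov)
        phi = normalized_ipd(v, f, ref_freq_hz)
        estimates.append(doa_from_normalized_ipd(phi, ref_freq_hz, spacing_m))
    return np.array(estimates)
```

In this idealized setting every bin yields the same angle after rescaling, so the per-bin statistics could be pooled. The sketch only holds while the raw IPD stays within (-π, π), that is, below the spatial aliasing limit of the assumed array.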
If this is right
- Wideband DOA estimates follow from normalized IPD derived after the subspace rotation.
- Computational cost stays lower than repeating narrowband algorithms across all frequencies.
- No additional prior knowledge beyond the narrowband method is required.
- The method applies to localization of multiple wideband sound sources.
- Experimental results indicate the approach is effective and robust.
Where Pith is reading between the lines
- The rotation step could reduce latency in real-time microphone-array applications on embedded hardware.
- Similar eigenvector-domain alignment might apply to other wideband array processing tasks that rely on phase.
- Further tests on reverberant rooms or moving sources would reveal whether the preserved IPD remains sufficient.
- Combining the rotation with existing narrowband subspace trackers could yield hybrid wideband trackers.
Load-bearing premise
That rotating the estimated signal subspace in the eigenvector domain produces accurate normalized wideband statistics that preserve the necessary inter-channel phase information without distortion.
What would settle it
If direct wideband processing or known ground-truth positions yield DOA estimates that differ substantially from those obtained via the rotated narrowband subspaces, the normalization step would be shown to lose required phase information.
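A comparison of this kind could be scored as in the sketch below, which assumes paired DOA estimates in degrees and known ground-truth angles. The function names and the tolerance `tol_deg` are hypothetical and are not taken from the paper's evaluation protocol.

```python
import numpy as np

def doa_rmse_deg(estimates_deg, truth_deg):
    """Root-mean-square DOA error in degrees over paired estimates."""
    err = np.asarray(estimates_deg, dtype=float) - np.asarray(truth_deg, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def estimates_disagree(rotated_deg, reference_deg, truth_deg, tol_deg=5.0):
    """True if the rotated-subspace RMSE exceeds the reference RMSE by more than tol_deg."""
    gap = doa_rmse_deg(rotated_deg, truth_deg) - doa_rmse_deg(reference_deg, truth_deg)
    return gap > tol_deg
```

No angle wrapping is applied, on the assumption that a linear array reports angles within [-90, 90] degrees.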
Original abstract
In this paper, the problem of extending narrowband multichannel sound source localization algorithms to the wideband case is addressed. The DOA estimation of narrowband algorithms is based on the estimate of inter-channel phase differences (IPD) between microphones of the sound sources. A new method for wideband sound source DOA estimation based on signal subspace rotation is present. The proposed algorithm normalizes the narrowband signal statistics by rotating the estimated signal subspace to the wideband counterpart in the eigenvector domain. Then the wideband DOA estimate can be obtained by estimating the normalized IPD from these wideband signal statistics. In addition to requiring less computational complexity compared to repeating the narrowband algorithms for all relevant frequencies of wideband signals, the proposed method also does not require any additional prior knowledge. The experimental results demonstrate the efficacy and the robustness of the proposed method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes extending narrowband DOA estimation to wideband sound sources via a signal subspace rotation method. Narrowband signal statistics are normalized by rotating the estimated signal subspace to its wideband counterpart in the eigenvector domain; wideband DOA is then recovered by estimating normalized inter-channel phase differences (IPD) from the resulting statistics. The approach is claimed to require less computation than applying narrowband methods across all frequencies and to need no additional prior knowledge, with experiments asserted to show efficacy and robustness.
Significance. If the rotation operation can be shown to preserve IPD without distortion for frequency-dependent array manifolds, the method would offer a lower-complexity alternative to incoherent or coherent wideband processing. The absence of any derivation, invariance argument, or quantitative validation in the manuscript, however, leaves the central claim unsupported.
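To make the frequency dependence concrete, the following sketch writes out the phase model implied by the signal model the paper states (uniform linear array with spacing Δd, far-field and free-field conditions). The sign convention, the speed of sound c, the reference frequency f_0, and the symbol \bar{\varphi} are notation assumed here, not taken from the manuscript.

```latex
\begin{align}
a_p(f,\theta) &= \exp\!\left(-j\,2\pi f\,\frac{(p-1)\,\Delta d\,\sin\theta}{c}\right),
  \quad p = 1,\dots,P,\\
\varphi(f,\theta) &= \arg\!\big(a_{p+1}(f,\theta)\,a_p^{*}(f,\theta)\big)
  = -\,2\pi f\,\frac{\Delta d\,\sin\theta}{c},\\
\bar{\varphi}(\theta) &= \frac{f_0}{f}\,\varphi(f,\theta)
  = -\,2\pi f_0\,\frac{\Delta d\,\sin\theta}{c}.
\end{align}
```

The second line holds only while 2π f Δd |sin θ| / c < π, that is, without spatial aliasing. Whether the paper's rotation achieves this rescaling for several simultaneous sources is what the major comments below ask the authors to demonstrate.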
Major comments (2)
- [Abstract] The central construction asserts that rotating the narrowband signal-subspace eigenvectors produces normalized wideband statistics from which IPD can be read off directly. No mapping, commutativity argument, or invariance proof is supplied showing that the rotated eigenvectors remain proportional to the wideband array manifold when the steering vectors vary with frequency; without this, the phase-extraction step is not guaranteed to recover correct wideband DOA.
- [Abstract] The claim that experiments demonstrate efficacy and robustness is unsupported by any reported error metrics, comparison baselines, array geometry, frequency range, or SNR conditions. This absence makes it impossible to assess whether the method actually outperforms or matches existing wideband techniques.
Minor comments (1)
- [Abstract] 'is present' should read 'is presented'.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment below and will revise the manuscript to strengthen the theoretical justification and abstract presentation.
- Referee: [Abstract] The central construction asserts that rotating the narrowband signal-subspace eigenvectors produces normalized wideband statistics from which IPD can be read off directly. No mapping, commutativity argument, or invariance proof is supplied showing that the rotated eigenvectors remain proportional to the wideband array manifold when the steering vectors vary with frequency; without this, the phase-extraction step is not guaranteed to recover correct wideband DOA.
  Authors: The full manuscript (Section III) defines the rotation operator explicitly in the eigenvector domain to normalize the narrowband statistics to their wideband counterpart. We agree that an explicit invariance argument would improve clarity. In the revised version we will insert a short derivation showing that the rotation is unitary and commutes with the phase extraction in a manner that leaves the inter-channel phase differences unchanged, thereby guaranteeing that the extracted IPD corresponds to the wideband array manifold. This addition directly addresses the concern.
  Revision: yes
- Referee: [Abstract] The claim that experiments demonstrate efficacy and robustness is unsupported by any reported error metrics, comparison baselines, array geometry, frequency range, or SNR conditions. This absence makes it impossible to assess whether the method actually outperforms or matches existing wideband techniques.
  Authors: The experimental protocol, including array geometry, frequency bands, SNR ranges, RMSE metrics, and comparisons against incoherent and coherent baselines, is reported in detail in Section IV. The abstract provides only a high-level summary. We will revise the abstract to incorporate the key quantitative conditions and performance figures so that the efficacy claims are self-contained at the abstract level.
  Revision: yes
Circularity Check
No circularity: method is a direct algorithmic construction without self-referential reduction
The paper proposes a subspace rotation procedure to normalize narrowband statistics into wideband equivalents for IPD-based DOA estimation. No equations are shown that define the target wideband statistics in terms of the rotation operator itself, nor does any step fit a parameter on a data subset and then relabel the output as a prediction. The central claim is an explicit construction (rotate eigenvectors, extract normalized IPD) whose validity rests on the geometric properties of the rotation, not on a self-citation chain or a fitted quantity renamed as output. External validation via experiments is referenced, keeping the derivation self-contained rather than tautological.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1] INTRODUCTION Sound source localization is an important component in many multichannel signal processing systems aiming, e.g., at source tracking, signal separation, enhancement and noise suppression [1, 2]. Traditional sound source localization algorithms such as GCC-PHAT [3] estimates the time delay of arrival (TDOA) between a pair of microphones t...
- [2] SIGNAL MODEL We assume a microphone array which fits the requirements for using ESPRIT, i.e., to localize Q sources. We use P microphones (P > Q) that form a linear array with a uniform spacing ∆d and with mutually uncorrelated zero-mean sensor noise. The environment is assumed with free-field and far-field conditions. The source-microphone model in the...
- [3] PROPOSED APPROACH The proposed approach considers a novel way for adapting ESPRIT to a single or multiple wideband sources. In case of a single source scenario, for each narrowband component, the proposed approach rotates the eigenvectors that span the signal subspaces by normalizing the IPD. In case of multiple sources, it reconstructs covariance matric...
- [4] IMPLEMENTATION In the estimated signal subspace rotation step (8), when f gets larger, a finer quantization is required to limit the effect of quantization errors on the DOA estimation. An iterative accumulation method is proposed to solve this numerical sensitivity problem. In each iteration, the estimated signal subspace from the i-th (i ∈ N+) f...
- [5] EVALUATION A set of experiments was performed in order to evaluate the performance of the algorithm using real-world recordings. The evaluation includes comparisons to the narrowband ESPRIT with the histogram method (hist-ESPRIT) [21] and to the CSS method [13]. 5.1. Experimental setup The recordings were captured by a uniform linear array (ULA) with ...
- [6] CONCLUSION A wideband signal subspace DOA estimation approach is presented. The proposed signal subspace rotation method and the narrowband signal covariance matrix reconstruction method are high and outperform the existing conventional approaches in computational complexity. Additionally, the proposed approach avoids the necessity of additional prior k...
- [7] W. Jin, M. J. Taghizadeh, K. Chen, and W. Xiao, “Multi-channel noise reduction for hands-free voice communication on mobile phones,” in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2017, pp. 506–510.
- [8] W. Jin, B. Desikan, A. Kumar, and K. Chen, “Multi-channel noise reduction with interference suppression on mobile phones,” in 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2018, pp. 201–205.
- [9] C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” Acoustics, Speech and Signal Processing, IEEE Transactions, vol. 24(4), pp. 320–327, 1976.
- [10] J. H. DiBiase, A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays, Ph.D. thesis, Brown University, 2000.
- [11] R. Schmidt, “Multiple emitter location and signal parameter estimation,” Antennas and Propagation, IEEE Transactions, vol. 34(3), pp. 276–280, 1986.
- [12] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques,” Acoustics, Speech and Signal Processing, IEEE Transactions, vol. 37(7), pp. 984–995, 1989.
- [13] A. Lombard, Y. Zheng, H. Buchner, and W. Kellermann, “TDOA estimation for multiple sound sources in noisy and reverberant environments using broadband independent component analysis,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19(6), pp. 1490–1503, 2011.
- [14] F. Nesta, P. Svaizer, and M. Omologo, “Convolutive BSS of short mixtures by ICA recursively regularized across frequencies,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 624–639, 2011.
- [15] A. Brendel, C. Huang, and W. Kellermann, “STFT bin selection for localization algorithms based on the sparsity of speech signal spectra,” in European Congress and Exposition on Noise Control Engineering. IEEE, 2018, pp. 2561–2568.
- [16] S. Araki, H. Sawada, R. Mukai, and S. Makino, “DOA estimation for multiple sparse sources with normalized observation vector clustering,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). IEEE, 2006, vol. 5.
- [17] H. Teutsch and W. Kellermann, “EB-ESPRIT: 2D localization of multiple wideband acoustic sources using eigenbeams,” IEEE International Conference, vol. iii/89 - iii/92 Vol. 3(3), pp. 89–92, 2005.
- [18] D. Khaykin and B. Rafaely, “Coherent signals direction-of-arrival estimation using a spherical microphone array: Frequency smoothing approach,” in Applications of Signal Processing to Audio and Acoustics, 2009. WASPAA ’09. IEEE Workshop on, 2009, pp. 221–224.
- [19] H. Wang and M. Kaveh, “Coherent signal-subspace processing for the detection and estimation of angles of arrival of multiple wide-band sources,” IEEE Transactions on Acoustics, Speech, and Signal Processing, 1985.
- [20] A. Shaw and R. Kumaresan, “Estimation of angles of arrivals of broadband signals,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1987.
- [21] Y.-H. Chen and R.-H. Chen, “Directions-of-arrival estimations of multiple coherent broadband signals,” IEEE Transactions on Aerospace and Electronic Systems, vol. 29, no. 3, pp. 1035–1043, 1993.
- [22] H. Hung and M. Kaveh, “Focussing matrices for coherent signal-subspace processing,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 8, pp. 1272–1281, 1988.
- [23] B. Ottersten and T. Kailath, “Direction-of-arrival estimation for wide-band signals using the ESPRIT algorithm,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 2, pp. 317–327, 1990.
- [24] F. Raimondi, P. Comon, and O. Michel, “Wideband multilinear array processing through tensor decomposition,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.
- [25] D. B. Ward, Z. Ding, and R. A. Kennedy, “Broadband DOA estimation using frequency invariant beamforming,” IEEE Transactions on Signal Processing, 1998.
- [26] K. Chen, J. T. Geiger, W. Jin, M. Taghizadeh, and W. Kellermann, “Robust phase replication method for spatial aliasing problem in multiple sound sources localization,” in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2017.
- [27] R. Roy, A. Paulraj, and T. Kailath, “Comparative performance of ESPRIT and MUSIC for direction-of-arrival estimation,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1987.
- [28] M. Cooke, J. Barker, S. Cunningham, and X. Shao, “An audio-visual corpus for speech perception and automatic speech recognition,” The Journal of the Acoustical Society of America, vol. 120(5), pp. 2421–2424, 2006.