arXiv: 2106.15659 · v1 · submitted 2021-06-29 · 📡 eess.AS · cs.SD

Towards a generalized monaural and binaural auditory model for psychoacoustics and speech intelligibility

Pith reviewed 2026-05-24 13:23 UTC · model grok-4.3

classification 📡 eess.AS cs.SD
keywords auditory model · binaural processing · psychoacoustics · monaural cues · envelope power spectrum · speech intelligibility · BMFD · hearing perception

The pith

A non-adaptive binaural stage with five fixed channels extends the monaural generalized envelope power spectrum model for unified psychoacoustic predictions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish a single auditory model that can account for both monaural cues from one ear and binaural cues from differences between ears. It does this by adding a simplified binaural processing stage to an existing monaural model, resulting in a five-channel output that feeds the same decision backend. This matters because previous models often treated monaural and binaural experiments separately or used more complex adaptive mechanisms. If the approach works, it offers a more compact way to simulate human hearing performance across a range of listening conditions. The evaluation uses a database of psychoacoustic experiments from the literature to test the combined model.

Core claim

The paper claims that extending the monaural generalized envelope power spectrum model by a non-adaptive binaural stage with only a few fixed output channels, resembling features of physiologically motivated hemispheric binaural processing, yields a 5-channel monaural and binaural matrix feature decoder (BMFD). The existing model backend then calculates short-time envelope power and power features from this output for both monaural and binaural experiments.

What carries the argument

The 5-channel monaural and binaural matrix feature decoder (BMFD) produced by the non-adaptive binaural stage that combines monaural and binaural signals into fixed channels for the shared backend.

If this is right

  • The model applies the same short-time envelope power and power feature calculations to binaural data as to monaural data.
  • The approach avoids the need for signal-adaptive delays or high-dimensional multichannel outputs in the binaural stage.
  • The BMFD enables evaluation on a baseline database of both monaural and binaural psychoacoustic experiments using one unified decision stage.
  • This framework moves toward a generalized model applicable to psychoacoustics and speech intelligibility tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the fixed-channel approach works, it implies that many binaural effects can be captured without adaptive delays.
  • The unified backend could support extensions to speech intelligibility models as suggested by the paper title.
  • Testing the model on new binaural experiments involving varying interaural time differences would further validate the fixed-channel design.

Load-bearing premise

That the non-adaptive binaural stage with only a few fixed output channels is sufficient to capture the essential aspects of binaural processing for accurate unified predictions.

What would settle it

Demonstrating that the model systematically underperforms on binaural experiments known to require signal-adaptive delays or more detailed spatial processing would challenge the central claim.

Figures

Figures reproduced from arXiv: 2106.15659 by Stephan D. Ewert, Thomas Biberger.

Figure 1. Block diagram of the GPSM with BMFD extension. After peripheral processing, the left and right ear signals are binaurally processed by using the BMFD that provides two better-ear channels BEL and BER and three binaural interaction channels BIL, BIC, BIR. For each of the five BMFD outputs, envelope power and power SNRs are calculated in short-time frames and then combined across the five channels of the BMF…
Figure 2. Empirical data (filled symbols) and model predictions (open symbols) for ITD thresholds in ms (upper panel) and IID thresholds in dB (lower panel). The lower panel of …
Figure 5. Empirical data (filled symbols) and model predictions (open symbols) for N…
Figure 8. Response of the BIL and BIR channels as a function of IPD (left panel) and ILD (right panel) for a 500 Hz pure tone. Negative IPDs indicate left ear leading, while negative ILDs indicate right ear more intense. Note that for clarity, amplitude and phase jitter were turned off. The (hemispheric) net neural activation is only partly resembled with the current subtraction process of the half-wave rectified co…
Original abstract

Auditory perception involves cues in the monaural auditory pathways as well as binaural cues based on differences between the ears. So far auditory models have often focused on either monaural or binaural experiments in isolation. Although binaural models typically build upon stages of (existing) monaural models, only a few attempts have been made to extend a monaural model by a binaural stage using a unified decision stage for monaural and binaural cues. In such approaches, a typical prototype of binaural processing has been the classical equalization-cancelation mechanism, which either involves signal-adaptive delays and provides a single channel output or can be implemented with tapped delays providing a high-dimensional multichannel output. This contribution extends the (monaural) generalized envelope power spectrum model by a non-adaptive binaural stage with only a few, fixed output channels. The binaural stage resembles features of physiologically motivated hemispheric binaural processing, as simplified signal processing stages, yielding a 5-channel monaural and binaural matrix feature "decoder" (BMFD). The back end of the existing monaural model is applied to the 5-channel BMFD output and calculates short-time envelope power and power features. The model is evaluated and discussed for a baseline database of monaural and binaural psychoacoustic experiments from the literature.
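
The back-end step described in the abstract, short-time features computed per BMFD channel and then merged, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the non-overlapping rectangular frames, the SNR definition (P_mix - P_noise) / P_noise with a lower floor, and the max-across-channels rule followed by a mean over frames are all assumptions made for the example. Only the five channel names come from the Figure 1 caption.

```python
from typing import Dict, List, Sequence

# Channel names taken from the Figure 1 caption; everything else is hypothetical.
CHANNELS = ("BEL", "BER", "BIL", "BIC", "BIR")


def frame_powers(x: Sequence[float], frame_len: int) -> List[float]:
    """Mean power in consecutive non-overlapping frames (a trailing partial frame is dropped)."""
    n_frames = len(x) // frame_len
    return [
        sum(v * v for v in x[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def short_time_snr(mix: Sequence[float], noise: Sequence[float],
                   frame_len: int, floor: float = 1e-3) -> List[float]:
    """Per-frame SNR, assumed here as (P_mix - P_noise) / P_noise, limited below by `floor`."""
    p_mix = frame_powers(mix, frame_len)
    p_noise = frame_powers(noise, frame_len)
    return [max((pm - pn) / max(pn, 1e-12), floor) for pm, pn in zip(p_mix, p_noise)]


def combined_snr(mix_by_ch: Dict[str, Sequence[float]],
                 noise_by_ch: Dict[str, Sequence[float]],
                 frame_len: int) -> float:
    """Assumed combination: largest SNR across the five channels per frame, then mean over frames."""
    per_channel = [short_time_snr(mix_by_ch[c], noise_by_ch[c], frame_len) for c in CHANNELS]
    best_per_frame = [max(values) for values in zip(*per_channel)]
    if not best_per_frame:
        raise ValueError("signals are shorter than one frame")
    return sum(best_per_frame) / len(best_per_frame)


# Toy input: constant "noise" in every channel, extra energy only in BIC.
noise = {c: [1.0] * 8 for c in CHANNELS}
mix = {c: ([2.0] * 8 if c == "BIC" else [1.0] * 8) for c in CHANNELS}
snr = combined_snr(mix, noise, frame_len=4)
```

In this toy case only the BIC channel carries extra energy, so the assumed max rule selects it in every frame and the other four channels sit at the floor.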

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript extends the monaural generalized envelope power spectrum model (GPSM) by adding a non-adaptive binaural stage that produces a 5-channel output matrix (BMFD) inspired by simplified hemispheric processing. The existing monaural backend is then applied to this matrix to extract short-time envelope power and power features, enabling a single decision stage for both monaural and binaural psychoacoustic data. The model is evaluated on a baseline database of monaural and binaural experiments drawn from the literature.

Significance. A successful unification of monaural and binaural processing within one low-dimensional, non-adaptive front-end and shared backend would be a notable contribution to auditory modeling, offering a parsimonious alternative to adaptive equalization-cancellation or high-dimensional tapped-delay architectures. The approach could improve computational tractability for applications in psychoacoustics and speech intelligibility while aligning with certain physiological features of binaural pathways.

major comments (2)
  1. [Model description (binaural stage) and evaluation section] The central claim that the fixed 5-channel BMFD supplies sufficient binaural information for the unified GPSM backend rests on an untested assumption about informational completeness. No section provides a direct comparison of the 5-channel representation against the range of ITDs/ILDs or unmasking effects in the binaural subset of the database, nor against classical EC or tapped-delay outputs; this is load-bearing because the evaluation plan in the abstract hinges on the non-adaptive stage being adequate.
  2. [Evaluation and results] The paper does not report quantitative metrics (e.g., prediction error, correlation coefficients, or cross-validation results) that would allow assessment of whether the 5-channel output actually enables the monaural backend to match binaural data at a level comparable to dedicated binaural models. Without these, the claim of a unified decision stage cannot be verified.
minor comments (2)
  1. [Model equations] Clarify the exact mapping from the 5 BMFD channels to the existing GPSM feature extraction; the notation for the matrix output and how short-time features are computed across channels should be made explicit.
  2. The abstract states the model is 'evaluated and discussed' but the manuscript should include a table or figure summarizing prediction accuracy separately for monaural versus binaural conditions to support the unification claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments and for recognizing the potential significance of a unified low-dimensional model. We respond to each major comment below and indicate the revisions we will make.

point-by-point responses
  1. Referee: [Model description (binaural stage) and evaluation section] The central claim that the fixed 5-channel BMFD supplies sufficient binaural information for the unified GPSM backend rests on an untested assumption about informational completeness. No section provides a direct comparison of the 5-channel representation against the range of ITDs/ILDs or unmasking effects in the binaural subset of the database, nor against classical EC or tapped-delay outputs; this is load-bearing because the evaluation plan in the abstract hinges on the non-adaptive stage being adequate.

    Authors: The manuscript evaluates sufficiency indirectly by showing that the 5-channel BMFD, when processed by the existing monaural backend, produces predictions consistent with human data across the binaural experiments in the baseline database. This supports the claim for the tested conditions without requiring adaptive mechanisms. We agree, however, that an explicit comparison of BMFD channel outputs to ITD/ILD ranges and to EC or tapped-delay representations would strengthen the argument. We will add this analysis, including example channel responses for representative binaural stimuli, in a revised evaluation section. revision: yes

  2. Referee: [Evaluation and results] The paper does not report quantitative metrics (e.g., prediction error, correlation coefficients, or cross-validation results) that would allow assessment of whether the 5-channel output actually enables the monaural backend to match binaural data at a level comparable to dedicated binaural models. Without these, the claim of a unified decision stage cannot be verified.

    Authors: The current evaluation presents model predictions against literature data through figures and qualitative discussion of agreement. We concur that quantitative metrics would enable clearer verification and comparison to dedicated binaural models. In the revision we will report root-mean-square error and Pearson correlation coefficients between model predictions and experimental thresholds for both the monaural and binaural subsets, together with a brief cross-validation note on parameter stability. revision: yes
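
The two metrics named in this response are standard and can be stated precisely. The sketch below computes the root-mean-square error and the Pearson correlation coefficient for paired predicted and measured thresholds. The function names are hypothetical and the numbers are made-up placeholders, not values from the paper.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between paired predictions and measurements."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(predicted))


def pearson_r(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Pearson correlation coefficient between paired predictions and measurements."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Placeholder thresholds in dB, invented purely for illustration.
predicted_db = [10.0, 12.0, 15.0, 11.0]
measured_db = [11.0, 12.5, 14.0, 10.0]
error_db = rmse(predicted_db, measured_db)
correlation = pearson_r(predicted_db, measured_db)
```

Reporting both values separately for the monaural and binaural subsets, as the response proposes, would show whether one subset dominates the overall fit.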

Circularity Check

0 steps flagged

No circularity: extension adds independent binaural stage to existing monaural model

full rationale

The provided abstract and description show the paper extending the prior monaural GPSM with a novel non-adaptive binaural stage (5-channel BMFD) that feeds into the existing backend. No equations, fitted parameters, or claims are presented that reduce any prediction or result to the inputs by construction, self-definition, or self-citation chains. The binaural stage is described as a new addition with fixed channels, and evaluation occurs on an external baseline database of experiments. This is a standard model-extension approach with independent content in the new stage; self-citation of the monaural base is normal and not load-bearing for circularity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The model introduces a new processing stage whose details rest on domain assumptions about auditory processing and a small number of design choices such as channel count.

free parameters (1)
  • number of BMFD output channels = 5
    Fixed at a small number to produce a 5-channel output; chosen to resemble hemispheric processing without adaptive mechanisms.
axioms (1)
  • domain assumption: The existing monaural back-end can be applied directly to the multichannel BMFD output without modification.
    Stated in the description of applying the back end to calculate envelope power and power features from the 5-channel output.
invented entities (1)
  • BMFD (monaural and binaural matrix feature decoder) · no independent evidence
    purpose: To provide a compact 5-channel representation combining monaural and binaural cues via simplified hemispheric processing.
    Newly introduced processing stage not present in the cited prior monaural model.
