CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92)
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Towards Improving Speaker Distance Estimation through Generative Impulse Response Augmentation
Generative augmentation of room impulse responses using FastRIR, filtered for alignment with challenge data, reduces speaker distance estimation MAE from 1.66m to 0.6m on GWA rooms and from 2.18m to 0.69m on Treble rooms.