
arxiv: 1905.06286 · v2 · submitted 2019-05-15 · cs.SD · cs.LG · eess.AS


End-to-End Multi-Channel Speech Separation

keywords: end-to-end, separation, multi-channel, speech, approach, architecture, learnable, model

The end-to-end approach to single-channel speech separation has been studied recently and has shown promising results. This paper extends that approach and proposes a new end-to-end model for multi-channel speech separation. The primary contributions of this work are: 1) an integrated waveform-in, waveform-out separation system in a single neural network architecture; 2) a reformulation of the traditional short-time Fourier transform (STFT) and inter-channel phase difference (IPD) as functions of time-domain convolution with special kernels; and 3) a relaxation of those fixed kernels to be learnable, so that the entire architecture becomes purely data-driven and can be trained end to end. We demonstrate on the WSJ0 far-field speech separation task that, with the benefit of learnable spatial features, the proposed end-to-end multi-channel model significantly improves on the performance of the previous end-to-end single-channel method and of traditional multi-channel methods.
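To illustrate the second contribution, the sketch below shows one common way an STFT can be written as a strided time-domain convolution with fixed cosine and sine kernels, and how an IPD can then be read off two channels. It is not the paper's implementation: the function names (`stft_kernels`, `frame_signal`, `conv_stft`, `ipd`), the Hann window, and the frame sizes are all assumptions made for this example, and the IPD is taken simply as the phase of the cross-spectrum, which may differ from the formulation in the paper.

```python
import numpy as np

def stft_kernels(n_fft, window):
    """Fixed 'special kernels': windowed cosine and negative-sine bases.

    Each row is one frequency bin; shape is (n_fft // 2 + 1, n_fft).
    """
    n = np.arange(n_fft)
    k = np.arange(n_fft // 2 + 1)[:, None]
    angle = 2.0 * np.pi * k * n / n_fft
    return np.cos(angle) * window, -np.sin(angle) * window

def frame_signal(x, n_fft, hop):
    """Slice a 1-D signal into overlapping frames of length n_fft."""
    n_frames = 1 + (len(x) - n_fft) // hop
    return np.stack([x[t * hop : t * hop + n_fft] for t in range(n_frames)])

def conv_stft(x, n_fft, hop, window):
    """STFT as a strided 1-D convolution (cross-correlation) with the kernels.

    Sliding each kernel over the signal with stride `hop` is the same as
    multiplying every frame by the kernel matrix.
    """
    real_k, imag_k = stft_kernels(n_fft, window)
    frames = frame_signal(x, n_fft, hop)
    return frames @ real_k.T + 1j * (frames @ imag_k.T)

def ipd(spec_a, spec_b):
    """Inter-channel phase difference, here the phase of the cross-spectrum."""
    return np.angle(spec_a * np.conj(spec_b))

# Hypothetical two-channel input: the second channel is a delayed copy.
rng = np.random.default_rng(0)
ch0 = rng.standard_normal(400)
ch1 = np.roll(ch0, 3)
window = np.hanning(64)

spec0 = conv_stft(ch0, n_fft=64, hop=16, window=window)
spec1 = conv_stft(ch1, n_fft=64, hop=16, window=window)
phase_diff = ipd(spec0, spec1)  # shape: (frames, frequency bins)
```

In a trainable model the arrays returned by `stft_kernels` would serve only as an initialisation (or be replaced entirely) for convolution weights updated by gradient descent, which is the relaxation described in the third contribution.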

