2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper IDAUD-8.4
Paper Title JOINT DEREVERBERATION AND SEPARATION WITH ITERATIVE SOURCE STEERING
Authors Taishi Nakashima, Robin Scheibler, Masahito Togami, LINE Corporation, Japan; Nobutaka Ono, Tokyo Metropolitan University, Japan
SessionAUD-8: Audio and Speech Source Separation 4: Multi-Channel Source Separation
LocationGather.Town
Session Time:Wednesday, 09 June, 13:00 - 13:45
Presentation Time:Wednesday, 09 June, 13:00 - 13:45
Presentation Poster
Topic Audio and Acoustic Signal Processing: [AUD-SEP] Audio and Speech Source Separation
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Virtual Presentation  Click here to watch in the Virtual Conference
Abstract We propose a new algorithm for joint dereverberation and blind source separation (DR-BSS). Our work builds upon the IRLMA-T framework that applies a unified filter combining dereverberation and separation. One drawback of this framework is that it requires several matrix inversions, an operation inherently costly and with potential stability issues. We leverage the recently introduced iterative source steering (ISS) updates to propose two algorithms mitigating this issue. Albeit derived from first principles, the first algorithm turns out to be a natural combination of weighted prediction error (WPE) dereverberation and ISS-based BSS, applied alternatingly. In this case, we manage to reduce the number of matrix inversion to only one per iteration and source. The second algorithm updates the ILRMA-T matrix using only sequential ISS updates requiring no matrix inversion at all. Its implementation is straightforward and memory efficient. Numerical experiments demonstrate that both methods achieve the same final performance as ILRMA-T in terms of several relevant objective metrics. In the important case of two sources, the number of iterations required is also similar.