2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

IEEE Signal Processing Society

Institute of Electrical and Electronics Engineers (IEEE)

Technical Program

Paper Detail

Paper ID	SPE-18.3
Paper Title	CUE-PRESERVING MMSE FILTER WITH BAYESIAN SNR MARGINALIZATION FOR BINAURAL SPEECH ENHANCEMENT
Authors	Stefan Thaleiser, Gerald Enzner, Ruhr-Universität Bochum, Germany
Session	SPE-18: Speech Enhancement 4: Multi-channel Processing
Location	Gather.Town
Session Time	Wednesday, 09 June, 14:00 - 14:45
Presentation Time	Wednesday, 09 June, 14:00 - 14:45
Presentation	Poster
Topic	Speech Processing: [SPE-ENHA] Speech Enhancement and Separation
Abstract	Binaural speech enhancement has often suffered from a trade-off between noise reduction and spatial cue preservation. Common-gain filtering of noisy speech under the minimum mean-square error (MMSE) criterion has turned out to be a viable approach, and it resembles the form of Wiener-filter spectral enhancement. These techniques critically require an estimate of the local, time-varying a-priori SNR. For single-channel approaches, it has recently been shown that the local a-priori SNR can be marginalized in a Bayesian sense within an MMSE approach. In this paper, we translate the single-channel approach into a binaural Bayesian SNR marginalization, based on a binaural a-priori SNR definition and a related hyperprior. The overall MMSE solution then becomes a posterior expectation of an informed cue-preserving Wiener filter function, whose computation is governed by the binaural a-posteriori SNR and the global SNR (i.e., the hyperprior mean). The resulting MMSE solution is thus easy to implement, and its performance consistently ranks at the top of our evaluation in terms of the segmental SNR, PESQ, and STOI metrics.
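
The idea summarized in the abstract can be illustrated with a minimal numerical sketch. It is not the authors' implementation. The abstract does not give the binaural SNR definition or the hyperprior, so the sketch assumes a channel-averaged a-posteriori SNR, an exponential likelihood from a complex Gaussian signal model, and an exponential hyperprior whose mean is the global SNR. All function names are hypothetical. The sketch shows two things: the Wiener gain averaged over the posterior of the unknown a-priori SNR, and the fact that applying one real-valued gain to both ears leaves the interaural level and phase differences unchanged.

```python
import numpy as np

def wiener_gain(xi):
    """Wiener gain for a linear-scale a-priori SNR xi."""
    return xi / (1.0 + xi)

def _trapezoid(f, x):
    """Trapezoidal rule along the last axis of f over the grid x."""
    return np.sum(0.5 * (f[..., 1:] + f[..., :-1]) * np.diff(x), axis=-1)

def marginalized_gain(gamma, xi_global, n_grid=2000):
    """Posterior mean of the Wiener gain over the unknown a-priori SNR.

    ASSUMPTIONS (illustrative, not taken from the paper):
      - p(gamma | xi) is exponential with mean 1 + xi
        (complex Gaussian speech and noise model)
      - the hyperprior p(xi) is exponential with mean xi_global
    """
    gamma = np.asarray(gamma, dtype=float)[..., None]
    # Log-spaced integration grid covering the bulk of the hyperprior.
    xi = np.logspace(-6, 2, n_grid) * xi_global
    likelihood = np.exp(-gamma / (1.0 + xi)) / (1.0 + xi)
    prior = np.exp(-xi / xi_global) / xi_global
    weights = likelihood * prior  # unnormalized posterior of xi
    return _trapezoid(wiener_gain(xi) * weights, xi) / _trapezoid(weights, xi)

def enhance_binaural(y_left, y_right, noise_psd, xi_global):
    """Apply one common real-valued gain to both ear signals.

    ASSUMPTION: the binaural a-posteriori SNR is taken as the
    channel-averaged noisy power divided by the noise PSD.
    """
    gamma = 0.5 * (np.abs(y_left) ** 2 + np.abs(y_right) ** 2) / noise_psd
    gain = marginalized_gain(gamma, xi_global)
    return gain * y_left, gain * y_right

# Usage on a few synthetic STFT coefficients (random test data).
rng = np.random.default_rng(0)
y_l = rng.standard_normal(8) + 1j * rng.standard_normal(8)
y_r = 0.5 * np.exp(1j * 0.3) * y_l
s_l, s_r = enhance_binaural(y_l, y_r, noise_psd=1.0, xi_global=2.0)
# The interaural transfer ratio is unchanged by the common gain.
print(np.allclose(s_l / s_r, y_l / y_r))
```

Because the gain is real, positive, and identical for both channels, it cancels in the ratio of the left and right outputs, which is why common-gain filtering preserves spatial cues by construction. The numerical integration stands in for whatever closed-form or tabulated posterior expectation the paper derives.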