Paper ID | AUD-19.3 |
Paper Title |
MAXIMUM A POSTERIORI ESTIMATOR FOR CONVOLUTIVE SOUND SOURCE SEPARATION WITH SUB-SOURCE BASED NTF MODEL AND THE LOCALIZATION PROBABILISTIC PRIOR ON THE MIXING MATRIX |
Authors |
Mieszko Fraś, Konrad Kowalczyk, AGH University of Science and Technology, Poland |
Session | AUD-19: Audio and Speech Source Separation 6: Topics in Source Separation |
Location | Gather.Town |
Session Time | Thursday, 10 June, 13:00 - 13:45 |
Presentation Time | Thursday, 10 June, 13:00 - 13:45 |
Presentation | Poster |
Topic |
Audio and Acoustic Signal Processing: [AUD-SEP] Audio and Speech Source Separation |
Abstract |
In this paper, we present a method for separating sound source signals recorded with multiple microphones in a reverberant room. In particular, we propose a maximum a posteriori (MAP) estimator based on the multichannel nonnegative tensor factorization (NTF) model with a localization prior distribution on the mixing matrix, in which the latent data consist of so-called sub-sources for improved performance in reverberant environments. For the proposed MAP estimator, we derive a sub-source based expectation-maximization (EM) algorithm with multiplicative update (MU) rules and the localization prior (LP) distribution on the mixing matrix (SSEM-MU-LP). We then perform several experiments with speech and instrumental sound sources recorded using two microphones, in determined and under-determined scenarios, and with different types of initialization of the model parameters. The results clearly indicate that the proposed algorithm with the localization prior significantly outperforms state-of-the-art NTF-based source separation algorithms, with improvements of up to $50\%$ in the signal-to-distortion ratio. |
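
To give a feel for the multiplicative update (MU) rules the abstract refers to, here is a minimal sketch of the classic MU rules for plain single-channel nonnegative matrix factorization under the generalized Kullback-Leibler divergence. It is not the SSEM-MU-LP algorithm: it has no multichannel mixing matrix, no sub-sources, no EM step, and no localization prior. The function names `kl_divergence` and `nmf_mu`, the parameter defaults, and the choice of divergence are assumptions made only for illustration.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(V || V_hat)."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def nmf_mu(V, n_components, n_iter=100, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix V (F x N) as W @ H with MU rules.

    Illustrative only: plain NMF with the KL divergence, not the
    multichannel NTF model with a localization prior from the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    ones = np.ones_like(V)
    history = []
    for _ in range(n_iter):
        # Update activations H with W fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat)) / (W.T @ ones + eps)
        # Update spectral templates W with the new H fixed.
        V_hat = W @ H + eps
        W *= ((V / V_hat) @ H.T) / (ones @ H.T + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history
```

Because each update multiplies the current factor by a ratio of nonnegative terms, `W` and `H` stay nonnegative without any explicit projection. In a MAP setting such as the one described in the abstract, a log-prior term on the parameters would be added to the objective and would alter the corresponding update; that step is not shown here.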