2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID SPE-6.4
Paper Title SPEECH DEREVERBERATION USING VARIATIONAL AUTOENCODERS
Authors Deepak Baby, Amazon Alexa, Germany; Hervé Bourlard, Idiap Research Institute, Switzerland
Session SPE-6: Speech Enhancement 2: Speech Separation and Dereverberation
Location Gather.Town
Session Time: Tuesday, 08 June, 14:00 - 14:45
Presentation Time: Tuesday, 08 June, 14:00 - 14:45
Presentation Poster
Topic Speech Processing: [SPE-ENHA] Speech Enhancement and Separation
Abstract This paper presents a statistical method for single-channel speech dereverberation using a variational autoencoder (VAE) for modelling the speech spectra. One popular approach for modelling speech spectra is to use non-negative matrix factorization (NMF) where learned clean speech spectral bases are used as a linear generative model for speech spectra. This work replaces this linear model with a powerful nonlinear deep generative model based on VAE. Further, this paper formulates a unified probabilistic generative model of reverberant speech based on Gaussian and Poisson distributions. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the VAE and estimating the room impulse response for both probabilistic models. Evaluation results show the superiority of the proposed VAE-based models over the NMF-based counterparts.
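Illustration The abstract describes NMF as a linear generative model in which learned clean speech spectral bases are combined to explain speech spectra. The sketch below shows only that baseline idea on toy data, and it is not the authors' implementation: it keeps a basis matrix fixed and estimates non-negative activations with the standard multiplicative update for the generalized Kullback-Leibler divergence (the cost that corresponds to a Poisson observation model). The names `estimate_activations` and `kl_divergence`, the random toy matrices, and the iteration count are all assumptions made for illustration. Reverberation, room impulse response estimation, and the VAE itself are left out.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence D(V || V_hat), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H with V ~ W @ H, keeping W fixed.

    V: (F, T) non-negative spectrogram, W: (F, K) non-negative bases.
    Uses the multiplicative KL update H <- H * (W^T (V / WH)) / (W^T 1).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps  # random positive start
    col_sums = W.sum(axis=0)[:, None]               # W^T 1, shape (K, 1)
    for _ in range(n_iter):
        V_hat = W @ H + eps                         # current linear model
        H *= (W.T @ (V / V_hat)) / (col_sums + eps)
    return H


# Toy data standing in for pre-trained speech bases and an observed spectrogram.
rng = np.random.default_rng(1)
W = rng.random((20, 4))           # hypothetical bases: 20 bins, 4 components
H_true = rng.random((4, 30))      # hypothetical activations: 30 frames
V = W @ H_true                    # spectrogram generated by the linear model

H = estimate_activations(V, W)
print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

In the approach summarized in the abstract, the product `W @ H` would be replaced by the output of a VAE decoder, and the latent variables would be inferred with a Monte Carlo expectation-maximization algorithm rather than the closed-form multiplicative update used here.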