2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

IEEE Signal Processing Society

Institute of Electrical and Electronics Engineers (IEEE)

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID	SPE-30.3
Paper Title	GENERATIVE SPEECH CODING WITH PREDICTIVE VARIANCE REGULARIZATION
Authors	W Bastiaan Kleijn, Victoria University of Wellington, New Zealand; Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh, Google, United States
Session	SPE-30: Speech Processing 2: General Topics
Location	Gather.Town
Session Time:	Wednesday, 09 June, 16:30 - 17:15
Presentation Time:	Wednesday, 09 June, 16:30 - 17:15
Presentation	Poster
Topic	Speech Processing: [SPE-CODI] Speech Coding
IEEE Xplore Open Preview	Click here to view in IEEE Xplore
Virtual Presentation	Click here to watch in the Virtual Conference
Abstract	The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the ineffectiveness of modeling a sum of independent signals with a single autoregressive model. We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance. We show that noise reduction to remove unwanted signals can significantly increase performance. We provide extensive subjective performance evaluations that show that our system based on generative modeling provides state-of-the-art coding performance at 3 kb/s for real-world speech signals at reasonable computational complexity