Paper ID | SS-13.5 | ||
Paper Title | COMMUNICATION-COST AWARE MICROPHONE SELECTION FOR NEURAL SPEECH ENHANCEMENT WITH AD-HOC MICROPHONE ARRAYS | ||
Authors | Jonah Casebeer, Jamshed Kaikaus, Paris Smaragdis, University of Illinois at Urbana-Champaign, United States | ||
Session | SS-13: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications | ||
Location | Gather.Town | ||
Session Time | Thursday, 10 June, 16:30 - 17:15 | ||
Presentation Time | Thursday, 10 June, 16:30 - 17:15 | ||
Presentation | Poster | ||
Topic | Special Sessions: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications | ||
Abstract | In this paper, we present a method for jointly learning a microphone selection mechanism and a speech enhancement network for multi-channel speech enhancement with an ad-hoc microphone array. The attention-based microphone selection mechanism is trained to reduce communication costs through a penalty term that represents a trade-off between task performance and communication cost. Within this trade-off, our method can intelligently stream from more microphones in lower-SNR scenes and from fewer microphones in higher-SNR scenes. We evaluate the model in complex echoic acoustic scenes with moving sources and show that it matches the performance of models that stream from a fixed number of microphones while reducing communication costs. |
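
The penalty-based trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the function name `penalized_loss`, the per-microphone streaming probabilities, the weight `cost_weight`, and all numbers are hypothetical, chosen only to show how a communication-cost penalty added to a task loss might favor streaming from fewer microphones when the task is already easy.

```python
def penalized_loss(task_loss, stream_probs, cost_weight):
    """Return the task loss plus a weighted communication-cost penalty.

    Assumption (not from the paper): the communication cost is modeled as
    the expected number of microphones streamed, i.e. the sum of the
    per-microphone streaming probabilities.
    """
    expected_mics = sum(stream_probs)
    return task_loss + cost_weight * expected_mics


# Hypothetical low-SNR scene: harder task, more microphones requested.
low_snr = penalized_loss(task_loss=0.75,
                         stream_probs=[1.0, 1.0, 0.5, 0.5],
                         cost_weight=0.25)

# Hypothetical high-SNR scene: easier task, a single microphone streamed.
high_snr = penalized_loss(task_loss=0.25,
                          stream_probs=[1.0, 0.0, 0.0, 0.0],
                          cost_weight=0.25)

print(low_snr, high_snr)
```

In this sketch, a larger `cost_weight` makes each streamed microphone more expensive relative to the task loss, which is one simple way to express the trade-off between task performance and communication cost.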