2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

SS-13: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications

Session Type: Poster
Time: Thursday, 10 June, 16:30 - 17:15
Location: Gather.Town
Session Chairs: Ante Jukić, Apple and Ahmed Abdelaziz, Apple
 
SS-13.1: AN EMPIRICAL STUDY OF VISUAL FEATURES FOR DNN BASED AUDIO-VISUAL SPEECH ENHANCEMENT IN MULTI-TALKER ENVIRONMENTS
         Shrishti Saha Shetu; Fraunhofer IIS
         Soumitro Chakrabarty; Fraunhofer IIS
         Emanuël Habets; Fraunhofer IIS
 
SS-13.2: ON THE ROLE OF VISUAL CUES IN AUDIOVISUAL SPEECH ENHANCEMENT
         Zakaria Aldeneh; University of Michigan
         Anushree Prasanna Kumar; Apple
         Barry-John Theobald; Apple
         Erik Marchi; Apple
         Sachin Kajarekar; Apple
         Devang Naik; Apple
         Ahmed Hussen Abdelaziz; Apple
 
SS-13.3: CONVOLUTIVE TRANSFER FUNCTION INVARIANT SDR TRAINING CRITERIA FOR MULTI-CHANNEL REVERBERANT SPEECH SEPARATION
         Christoph Boeddeker; Paderborn University
         Wangyou Zhang; Shanghai Jiao Tong University
         Tomohiro Nakatani; NTT Corporation
         Keisuke Kinoshita; NTT Corporation
         Tsubasa Ochiai; NTT Corporation
         Marc Delcroix; NTT Corporation
         Naoyuki Kamo; NTT Corporation
         Yanmin Qian; Shanghai Jiao Tong University
         Reinhold Haeb-Umbach; Paderborn University
 
SS-13.4: DIRECTIONAL ASR: A NEW PARADIGM FOR E2E MULTI-SPEAKER SPEECH RECOGNITION WITH SOURCE LOCALIZATION
         Aswin Shanmugam Subramanian; Johns Hopkins University
         Chao Weng; Tencent AI Lab
         Shinji Watanabe; Johns Hopkins University
         Meng Yu; Tencent AI Lab
         Yong Xu; Tencent AI Lab
         Shi-Xiong Zhang; Tencent AI Lab
         Dong Yu; Tencent AI Lab
 
SS-13.5: COMMUNICATION-COST AWARE MICROPHONE SELECTION FOR NEURAL SPEECH ENHANCEMENT WITH AD-HOC MICROPHONE ARRAYS
         Jonah Casebeer; University of Illinois at Urbana-Champaign
         Jamshed Kaikaus; University of Illinois at Urbana-Champaign
         Paris Smaragdis; University of Illinois at Urbana-Champaign
 
SS-13.6: DEEP MULTI-FRAME MVDR FILTERING FOR SINGLE-MICROPHONE SPEECH ENHANCEMENT
         Marvin Tammen; University of Oldenburg
         Simon Doclo; University of Oldenburg