SPE-51: Speech Enhancement 7: Single-channel Processing |
Session Type: Poster |
Time: Friday, 11 June, 13:00 - 13:45 |
Location: Gather.Town |
Session Chair: Ann Spriet, GOODIX Technology Inc. |
SPE-51.1: TSTNN: TWO-STAGE TRANSFORMER BASED NEURAL NETWORK FOR SPEECH ENHANCEMENT IN THE TIME DOMAIN |
Kai Wang; Concordia University |
Bengbeng He; Concordia University |
Wei-Ping Zhu; Concordia University |
SPE-51.2: SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT |
Huy Phan; Queen Mary University of London |
Huy Le Nguyen; Ho Chi Minh City University of Technology |
Oliver Chén; University of Oxford |
Philipp Koch; University of Lübeck |
Ngoc Q. K. Duong; InterDigital R&D France |
Ian McLoughlin; Singapore Institute of Technology |
Alfred Mertins; University of Lübeck |
SPE-51.3: NEURAL KALMAN FILTERING FOR SPEECH ENHANCEMENT |
Wei Xue; JD AI Research |
Gang Quan; JD AI Research |
Chao Zhang; JD AI Research |
Guohong Ding; JD AI Research |
Xiaodong He; JD AI Research |
Bowen Zhou; JD AI Research |
SPE-51.4: NEURAL NOISE EMBEDDING FOR END-TO-END SPEECH ENHANCEMENT WITH CONDITIONAL LAYER NORMALIZATION |
Zhihui Zhang; Wuhan University of Technology |
Xiaoqi Li; Wuhan University of Technology |
Yaxing Li; Wuhan University of Technology |
Yuanjie Dong; Wuhan University of Technology |
Dan Wang; Wuhan University of Technology |
Shengwu Xiong; Wuhan University of Technology |
SPE-51.5: PERCEPTUAL LOSS BASED SPEECH DENOISING WITH AN ENSEMBLE OF AUDIO PATTERN RECOGNITION AND SELF-SUPERVISED MODELS |
Saurabh Kataria; Johns Hopkins University |
Jesús Villalba; Johns Hopkins University |
Najim Dehak; Johns Hopkins University |
SPE-51.6: TOWARDS AN ASR APPROACH USING ACOUSTIC AND LANGUAGE MODELS FOR SPEECH ENHANCEMENT |
Khandokar Md. Nayem; Indiana University |
Donald S. Williamson; Indiana University |