SPE-51: Speech Enhancement 7: Single-channel Processing |
| Session Type: Poster |
| Time: Friday, 11 June, 13:00 - 13:45 |
| Location: Gather.Town |
| Session Chair: Ann Spriet, GOODIX Technology Inc. |
| SPE-51.1: TSTNN: TWO-STAGE TRANSFORMER BASED NEURAL NETWORK FOR SPEECH ENHANCEMENT IN THE TIME DOMAIN |
| Kai Wang; Concordia University |
| Bengbeng He; Concordia University |
| Wei-Ping Zhu; Concordia University |
| SPE-51.2: SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORK FOR SPEECH ENHANCEMENT |
| Huy Phan; Queen Mary University of London |
| Huy Le Nguyen; Ho Chi Minh City University of Technology |
| Oliver Chén; University of Oxford |
| Philipp Koch; University of Lübeck |
| Ngoc Q. K. Duong; InterDigital R&D France |
| Ian McLoughlin; Singapore Institute of Technology |
| Alfred Mertins; University of Lübeck |
| SPE-51.3: NEURAL KALMAN FILTERING FOR SPEECH ENHANCEMENT |
| Wei Xue; JD AI Research |
| Gang Quan; JD AI Research |
| Chao Zhang; JD AI Research |
| Guohong Ding; JD AI Research |
| Xiaodong He; JD AI Research |
| Bowen Zhou; JD AI Research |
| SPE-51.4: NEURAL NOISE EMBEDDING FOR END-TO-END SPEECH ENHANCEMENT WITH CONDITIONAL LAYER NORMALIZATION |
| Zhihui Zhang; Wuhan University of Technology |
| Xiaoqi Li; Wuhan University of Technology |
| Yaxing Li; Wuhan University of Technology |
| Yuanjie Dong; Wuhan University of Technology |
| Dan Wang; Wuhan University of Technology |
| Shengwu Xiong; Wuhan University of Technology |
| SPE-51.5: PERCEPTUAL LOSS BASED SPEECH DENOISING WITH AN ENSEMBLE OF AUDIO PATTERN RECOGNITION AND SELF-SUPERVISED MODELS |
| Saurabh Kataria; Johns Hopkins University |
| Jesús Villalba; Johns Hopkins University |
| Najim Dehak; Johns Hopkins University |
| SPE-51.6: TOWARDS AN ASR APPROACH USING ACOUSTIC AND LANGUAGE MODELS FOR SPEECH ENHANCEMENT |
| Khandokar Md. Nayem; Indiana University |
| Donald S. Williamson; Indiana University |