SPE-44: Speech Recognition 16: Robust Speech Recognition 2 |
Session Type: Poster |
Time: Thursday, 10 June, 16:30 - 17:15 |
Location: Gather.Town |
Session Chair: Abdelrahman Mohamed, Facebook AI Research (FAIR)
|
SPE-44.1: AN INVESTIGATION OF END-TO-END MODELS FOR ROBUST SPEECH RECOGNITION |
Archiki Prasad; Indian Institute of Technology, Bombay |
Preethi Jyothi; Indian Institute of Technology, Bombay |
Rajbabu Velmurugan; Indian Institute of Technology, Bombay |
|
SPE-44.2: END-TO-END DEREVERBERATION, BEAMFORMING, AND SPEECH RECOGNITION WITH IMPROVED NUMERICAL STABILITY AND ADVANCED FRONTEND |
Wangyou Zhang; Shanghai Jiao Tong University |
Christoph Boeddeker; Paderborn University |
Shinji Watanabe; Johns Hopkins University |
Tomohiro Nakatani; NTT Corporation |
Marc Delcroix; NTT Corporation |
Keisuke Kinoshita; NTT Corporation |
Tsubasa Ochiai; NTT Corporation |
Naoyuki Kamo; NTT Corporation |
Reinhold Haeb-Umbach; Paderborn University |
Yanmin Qian; Shanghai Jiao Tong University |
|
SPE-44.3: STREAMING MULTI-SPEAKER ASR WITH RNN-T |
Ilya Sklyar; Amazon |
Anna Piunova; Amazon |
Yulan Liu; Amazon |
|
SPE-44.4: IMPROVING RNN TRANSDUCER WITH TARGET SPEAKER EXTRACTION AND NEURAL UNCERTAINTY ESTIMATION |
Jiatong Shi; The Johns Hopkins University |
Chunlei Zhang; Tencent AI Lab |
Chao Weng; Tencent AI Lab |
Shinji Watanabe; The Johns Hopkins University |
Meng Yu; Tencent AI Lab |
Dong Yu; Tencent AI Lab |
|
SPE-44.5: A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION |
Zhaoxu Nian; University of Science and Technology of China |
Yan-Hui Tu; University of Science and Technology of China |
Jun Du; University of Science and Technology of China |
Chin-Hui Lee; Georgia Institute of Technology |
|
SPE-44.6: THE ACCENTED ENGLISH SPEECH RECOGNITION CHALLENGE 2020: OPEN DATASETS, TRACKS, BASELINES, RESULTS AND METHODS |
Xian Shi; Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University |
Fan Yu; Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University |
Yizhou Lu; SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University |
Yuhao Liang; Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University |
Qiangze Feng; Datatang (Beijing) Technology Co., LTD |
Daliang Wang; Datatang (Beijing) Technology Co., LTD |
Yanmin Qian; SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University |
Lei Xie; Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University |
|