SPE-21: Speech Recognition 7: Training Methods for End-to-End Modeling
Session Type: Poster
Time: Wednesday, 9 June, 15:30 - 16:15
Location: Gather.Town |
Session Chair: Karen Livescu, TTI-Chicago |
SPE-21.1: TOP-DOWN ATTENTION IN END-TO-END SPOKEN LANGUAGE UNDERSTANDING |
Yixin Chen; University of California, Los Angeles |
Weiyi Lu; Amazon Alexa |
Alejandro Mottini; Amazon Alexa |
Li Erran Li; Amazon Alexa |
Jasha Droppo; Amazon Alexa |
Zheng Du; Amazon Alexa |
Belinda Zeng; Amazon Alexa |
SPE-21.2: FINE-TUNING OF PRE-TRAINED END-TO-END SPEECH RECOGNITION WITH GENERATIVE ADVERSARIAL NETWORKS |
Md. Akmal Haidar; Huawei Noah's Ark Lab |
Mehdi Rezagholizadeh; Huawei Noah's Ark Lab |
SPE-21.3: A GENERAL MULTI-TASK LEARNING FRAMEWORK TO LEVERAGE TEXT DATA FOR SPEECH TO TEXT TASKS |
Yun Tang; Facebook |
Juan Pino; Facebook |
Changhan Wang; Facebook |
Xutai Ma; Johns Hopkins University |
Dmitriy Genzel; Facebook |
SPE-21.4: GAUSSIAN KERNELIZED SELF-ATTENTION FOR LONG SEQUENCE DATA AND ITS APPLICATION TO CTC-BASED SPEECH RECOGNITION |
Yosuke Kashiwagi; Sony Corporation |
Emiru Tsunoo; Sony Corporation |
Shinji Watanabe; Johns Hopkins University |
SPE-21.5: LATTICE-FREE MMI ADAPTATION OF SELF-SUPERVISED PRETRAINED ACOUSTIC MODELS |
Apoorv Vyas; Idiap Research Institute and EPFL |
Srikanth Madikeri; Idiap Research Institute |
Hervé Bourlard; Idiap Research Institute |
SPE-21.6: INTERMEDIATE LOSS REGULARIZATION FOR CTC-BASED SPEECH RECOGNITION |
Jaesong Lee; Naver Corporation |
Shinji Watanabe; Johns Hopkins University |