Paper ID | SPE-13.1 | ||
Paper Title | META-LEARNING FOR IMPROVING RARE WORD RECOGNITION IN END-TO-END ASR | ||
Authors | Florian Lux, Ngoc Thang Vu, University of Stuttgart, Germany | ||
Session | SPE-13: Speech Recognition 5: New Algorithms | ||
Location | Gather.Town | ||
Session Time: | Wednesday, 09 June, 13:00 - 13:45 | ||
Presentation Time: | Wednesday, 09 June, 13:00 - 13:45 | ||
Presentation | Poster | ||
Topic | Speech Processing: [SPE-GASR] General Topics in Speech Recognition | ||
IEEE Xplore Open Preview | Click here to view in IEEE Xplore | ||
Abstract | In this work we take on the challenge of rare word recognition in end-to-end (E2E) automatic speech recognition (ASR) by integrating a meta learning mechanism into an E2E ASR system, enabling few-shot adaptation. We propose a novel method of generating embeddings for speech, changes to four meta learning approaches, enabling them to perform keyword spotting and an approach to using their outcomes in an E2E ASR system. We verify the functionality of each of our three contributions in two experiments exploring their performance for different amounts of classes (N-way) and examples per class (k-shot) in a few-shot setting. We find that the information encoded in the speech embeddings suffices to allow the modified meta learning approaches to perform continuous signal spotting. Despite the simplicity of the interface between keyword spotting and speech recognition, we are able to consistently improve word error rate by up to 5%. |