Paper ID | HLT-15.1 |
Paper Title |
MISPRONUNCIATION DETECTION IN NON-NATIVE (L2) ENGLISH WITH UNCERTAINTY MODELING |
Authors |
Daniel Korzekwa, Jaime Lorenzo-Trueba, Amazon, Poland; Szymon Zaporowski, Gdansk University of Technology, Poland; Shira Calamaro, Thomas Drugman, Amazon, United States; Bozena Kostek, Gdansk University of Technology, Poland |
Session | HLT-15: Language Assessment |
Location | Gather.Town |
Session Time | Thursday, 10 June, 16:30 - 17:15 |
Presentation Time | Thursday, 10 June, 16:30 - 17:15 |
Presentation | Poster |
Topic |
Human Language Technology: [HLT-LACL] Language Acquisition and Learning |
Abstract |
A common approach to the automatic detection of mispronunciation in language learning is to recognize the phonemes produced by a student and compare them to the expected pronunciation of a native speaker. This approach makes two simplifying assumptions: a) phonemes can be recognized from speech with high accuracy, and b) there is a single correct way to pronounce a sentence. These assumptions do not always hold, which can result in a significant number of false mispronunciation alarms. We propose a novel approach to overcome this problem based on two principles: a) taking into account uncertainty in the automatic phoneme recognition step, and b) accounting for the fact that there may be multiple valid pronunciations. We evaluate the model on non-native (L2) English speech of German, Italian and Polish speakers, where it is shown to increase the precision of detecting mispronunciations by up to 18% (relative) compared to the common approach. |
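The contrast drawn in the abstract between a single reference pronunciation and several valid ones can be illustrated with a small sketch. This is not the authors' model: it does not represent recognition uncertainty at all, and it simply compares an already-recognized phoneme sequence against one or more references using edit distance. The function names and the ARPAbet-style phoneme symbols are assumptions chosen for illustration.

```python
from typing import Sequence


def edit_distance(recognized: Sequence[str], expected: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(expected) + 1))
    for i, r in enumerate(recognized, start=1):
        curr = [i]
        for j, e in enumerate(expected, start=1):
            cost = 0 if r == e else 1
            curr.append(min(prev[j] + 1,          # drop a recognized phoneme
                            curr[j - 1] + 1,      # insert an expected phoneme
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def flag_single_reference(recognized: Sequence[str],
                          expected: Sequence[str]) -> bool:
    """Baseline: any deviation from one canonical pronunciation is flagged."""
    return edit_distance(recognized, expected) > 0


def flag_multiple_references(recognized: Sequence[str],
                             variants: Sequence[Sequence[str]]) -> bool:
    """Flag only if the speech matches none of the accepted variants."""
    return min(edit_distance(recognized, v) for v in variants) > 0


# Illustrative example: two accepted pronunciations of "either".
recognized = ["AY", "DH", "ER"]
canonical = ["IY", "DH", "ER"]
variants = [canonical, ["AY", "DH", "ER"]]

false_alarm = flag_single_reference(recognized, canonical)
accepted = not flag_multiple_references(recognized, variants)
```

With a single reference the valid variant is reported as a mispronunciation, while checking against the set of variants accepts it. A real system would also have to handle errors made by the phoneme recognizer itself, which this sketch assumes away.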