Paper ID | SPE-55.5 |
Paper Title |
PHONE DISTRIBUTION ESTIMATION FOR LOW RESOURCE LANGUAGES |
Authors |
Xinjian Li, Juncheng Li, Jiali Yao, Alan Black, Florian Metze, Carnegie Mellon University, United States |
Session | SPE-55: Language Identification and Low Resource Speech Recognition |
Location | Gather.Town |
Session Time: | Friday, 11 June, 14:00 - 14:45 |
Presentation Time: | Friday, 11 June, 14:00 - 14:45 |
Presentation | Poster |
Topic |
Speech Processing: [SPE-MULT] Multilingual Recognition and Identification |
Abstract |
Phones are critical components in various computational linguistic fields; for example, phone distributions can be helpful in speech recognition and speech synthesis. Traditional approaches to estimating phone distributions typically involve G2P (grapheme-to-phoneme) systems, which are either manually designed by linguists or trained on large datasets. These prohibitive requirements make research on low-resource languages extremely challenging. In this work, we propose a novel approach that estimates phone distributions from raw audio datasets alone. We first estimate the phone ranks by combining language-independent recognition results with Learning to Rank results. Next, we approximate the distribution with Expectation-Maximization by fitting a Yule distribution. Results on 7 languages show that the joint model performs better in both the ranking estimation and distribution estimation tasks. |
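
To make the second step concrete, the following is a minimal sketch of fitting a Yule-type rank-frequency curve to a phone distribution. It is not the paper's procedure: the abstract does not give the parameterization, so the form p(r) proportional to b^r / r^c is an assumption, and plain linear least squares in log space stands in for the Expectation-Maximization fit named above. The function names `yule_rank_probs` and `fit_yule_least_squares` are hypothetical.

```python
import numpy as np


def yule_rank_probs(n_phones, b, c):
    """Normalized curve p(r) proportional to b**r / r**c for ranks r = 1..n_phones.

    The parameterization is an assumption, not taken from the paper.
    """
    ranks = np.arange(1, n_phones + 1)
    weights = b ** ranks / ranks ** c
    return weights / weights.sum()


def fit_yule_least_squares(probs):
    """Estimate (b, c) from strictly positive phone probabilities.

    Uses the linear relation log p(r) = log a + r * log b - c * log r.
    This is a least-squares stand-in, not the EM fit from the abstract.
    """
    probs = np.sort(np.asarray(probs, dtype=float))[::-1]  # rank 1 = most frequent
    ranks = np.arange(1, len(probs) + 1)
    design = np.column_stack([np.ones(len(probs)), ranks, -np.log(ranks)])
    coef, *_ = np.linalg.lstsq(design, np.log(probs), rcond=None)
    _, log_b, c = coef
    return float(np.exp(log_b)), float(c)


# Usage: recover the parameters of a synthetic 40-phone distribution.
true_probs = yule_rank_probs(40, b=0.95, c=0.6)
b_hat, c_hat = fit_yule_least_squares(true_probs)
```

On real data the inputs would be estimated phone frequencies ordered by the predicted ranks; phones with zero estimated probability would need smoothing before taking logarithms.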