Paper ID | SPE-27.5 |
Paper Title |
NEURAL UTTERANCE CONFIDENCE MEASURE FOR RNN-TRANSDUCERS AND TWO PASS MODELS |
Authors |
Ashutosh Gupta, Ankur Kumar, Samsung Research Institute, Bangalore, India; Dhananjaya Gowda, Kwangyoun Kim, Samsung Research Korea, South Korea; Sachin Singh, Samsung Bangalore, India; Shatrughan Singh, Samsung Research, India; Chanwoo Kim, Samsung Korea, South Korea |
Session | SPE-27: Speech Recognition 9: Confidence Measures |
Location | Gather.Town |
Session Time: | Wednesday, 09 June, 16:30 - 17:15 |
Presentation Time: | Wednesday, 09 June, 16:30 - 17:15 |
Presentation | Poster |
Topic |
Speech Processing: [SPE-GASR] General Topics in Speech Recognition |
Abstract |
In this paper, we propose methods to compute a confidence score on the predictions made by an end-to-end speech recognition model in a two-pass framework. We use an RNN-Transducer as the streaming model and an attention-based decoder as the second-pass model. We use a neural technique to compute the confidence score, and experiment with various combinations of features from the RNN-Transducer and second-pass models. The neural confidence score model is trained as a binary classification task to accept or reject a prediction made by the speech recognition model. The model is evaluated in a distributed speech recognition environment, and performs significantly better when features from the second-pass model are used as compared to features from the streaming model. |
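
The accept/reject formulation in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract describes a neural confidence model, whereas the code below substitutes plain logistic regression as the simplest binary classifier. The two input features (a mean token log-probability from the streaming pass and one from the second pass), the toy data, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def confidence(w, b, x):
    """Confidence score in (0, 1) for one utterance-level feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_confidence_model(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression with per-sample gradient descent.

    labels: 1 = the recognizer's hypothesis was correct (accept),
            0 = the hypothesis was wrong (reject).
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            g = confidence(w, b, x) - y  # gradient of log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Hypothetical toy data: [streaming-pass mean log-prob, second-pass mean log-prob].
X = [[-0.1, -0.05], [-0.2, -0.1], [-0.3, -0.2],
     [-1.5, -2.0], [-1.2, -1.8], [-2.0, -2.5]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_confidence_model(X, y)

def accept(x, threshold=0.5):
    """Accept the prediction when its confidence reaches the threshold."""
    return confidence(w, b, x) >= threshold
```

In practice the threshold would be tuned on held-out data to trade off false accepts against false rejects; the fixed 0.5 here is only a placeholder.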