2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID: SPE-58.4
Paper Title: MULTI-TASK TRANSFORMER WITH INPUT FEATURE RECONSTRUCTION FOR DYSARTHRIC SPEECH RECOGNITION
Authors: Chaoyue Ding, Shiliang Sun, Jing Zhao, East China Normal University, China
Session: SPE-58: Dysarthric Speech Processing
Location: Gather.Town
Session Time: Friday, 11 June, 14:00 - 14:45
Presentation Time: Friday, 11 June, 14:00 - 14:45
Presentation: Poster
Topic: Speech Processing: [SPE-ANLS] Speech Analysis
Abstract: Dysarthria is a motor speech disorder caused by damage to the part of the nervous system that controls the physical production of speech. High inter- and intra-speaker variability makes it difficult to build robust dysarthric speech recognition (DSR) systems. To address this, we propose a multi-task Transformer with input feature reconstruction as an auxiliary task, in which the main DSR task and the auxiliary reconstruction task share the same encoder network. The auxiliary task reconstructs clear speech features from corrupted speech of healthy speakers (intra-domain) or dysarthric speakers (cross-domain). Further, to alleviate the imbalanced distribution of dysarthric data sets, we devise an adaptive rebalance sampling scheme that increases the sampling frequency of dysarthric speech. Experimental results show that the proposed model considerably outperforms the baselines across speakers with varying severity of dysarthria.
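
The two ideas in the abstract, a shared-encoder objective that combines the DSR loss with a reconstruction loss, and sampling that favors the under-represented dysarthric utterances, can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the `aux_weight` interpolation, and the fixed inverse-frequency weighting with exponent `alpha` (a simple stand-in, not the paper's adaptive scheme).

```python
import random
from collections import Counter


def combined_loss(dsr_loss, recon_loss, aux_weight=0.3):
    """Interpolate the main DSR loss and the auxiliary reconstruction loss.

    aux_weight is a hypothetical hyperparameter; the abstract does not say
    how the two task losses are weighted.
    """
    return (1.0 - aux_weight) * dsr_loss + aux_weight * recon_loss


def rebalance_weights(groups, alpha=1.0):
    """Return normalized per-utterance sampling probabilities.

    Each utterance gets a weight of (1 / size of its group) ** alpha.
    alpha = 0 gives uniform sampling over utterances; alpha = 1 gives every
    group the same total probability. This fixed rule is an assumption,
    not the adaptive scheme proposed in the paper.
    """
    counts = Counter(groups)
    raw = [(1.0 / counts[g]) ** alpha for g in groups]
    total = sum(raw)
    return [w / total for w in raw]


def sample_batch(groups, batch_size, alpha=1.0, seed=0):
    """Draw utterance indices (with replacement) using the rebalanced weights."""
    rng = random.Random(seed)
    weights = rebalance_weights(groups, alpha)
    return rng.choices(range(len(groups)), weights=weights, k=batch_size)


if __name__ == "__main__":
    # Toy corpus: 8 healthy utterances and 2 dysarthric ones.
    groups = ["healthy"] * 8 + ["dysarthric"] * 2
    probs = rebalance_weights(groups, alpha=1.0)
    dys_mass = sum(p for p, g in zip(probs, groups) if g == "dysarthric")
    print("probability mass on dysarthric speech:", dys_mass)
    print("sampled indices:", sample_batch(groups, batch_size=6))
    print("combined loss:", combined_loss(2.0, 1.0, aux_weight=0.25))
```

In this toy setting, dysarthric speech makes up 20% of the utterances, yet with `alpha=1.0` it receives half of the sampling probability. An adaptive variant would adjust the weights during training rather than fix them up front.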