2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID: SPE-57.5
Paper Title: PAUSE-ENCODED LANGUAGE MODELS FOR RECOGNITION OF ALZHEIMER'S DISEASE AND EMOTION
Authors: Jiahong Yuan, Xingyu Cai, Kenneth Church (Baidu Research, USA)
Session: SPE-57: Speech, Depression and Sleepiness
Location: Gather.Town
Session Time: Friday, 11 June, 14:00 - 14:45
Presentation Time: Friday, 11 June, 14:00 - 14:45
Presentation: Poster
Topic: Speech Processing: [SPE-ANLS] Speech Analysis
Abstract: We propose enhancing Transformer language models (BERT, RoBERTa) to take advantage of pauses. Pauses play an important role in speech. In previous work we developed a method to encode pauses in transcripts for recognition of Alzheimer’s disease. In this study, we extend this idea to language models. We re-train BERT and RoBERTa using a large collection of pause-encoded transcripts, and conduct fine-tuning for two downstream tasks, recognition of Alzheimer’s disease and emotion. Pause-encoded language models outperform text-only language models on these tasks. Pause augmentation by duration perturbation for training is shown to improve pause-encoded language models.
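The abstract does not say how pauses are written into the transcripts or how durations are perturbed. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes word-level timestamps (for example from a forced aligner), maps each inter-word gap to a punctuation-like token by duration, and jitters the gaps with uniform noise for augmentation. The thresholds, the token inventory, and the names `pause_token`, `encode_pauses`, and `perturb_pauses` are all hypothetical.

```python
import random

# Hypothetical duration thresholds (seconds) and pause tokens.
# The abstract does not specify them; these are illustrative only.
MIN_PAUSE = 0.05   # gaps shorter than this are not encoded
SHORT_MAX = 0.5    # gaps in [MIN_PAUSE, SHORT_MAX) -> ","
MEDIUM_MAX = 2.0   # gaps in [SHORT_MAX, MEDIUM_MAX) -> "."; longer -> "..."


def pause_token(duration):
    """Map a gap duration in seconds to a pause token ('' means no token)."""
    if duration < MIN_PAUSE:
        return ""
    if duration < SHORT_MAX:
        return ","
    if duration < MEDIUM_MAX:
        return "."
    return "..."


def encode_pauses(words):
    """Build a pause-encoded transcript.

    words: list of (word, start_sec, end_sec) tuples in time order.
    """
    tokens = []
    prev_end = None
    for word, start, end in words:
        if prev_end is not None:
            tok = pause_token(start - prev_end)
            if tok:
                tokens.append(tok)
        tokens.append(word)
        prev_end = end
    return " ".join(tokens)


def perturb_pauses(words, max_shift=0.1, rng=None):
    """Return a copy of `words` with every inter-word gap jittered.

    Each gap gets uniform noise in [-max_shift, max_shift] and is clipped
    at zero; word durations are left unchanged.
    """
    rng = rng or random.Random()
    out = []
    prev_end_orig = None
    prev_end_new = None
    for word, start, end in words:
        if prev_end_orig is None:
            new_start = start
        else:
            gap = start - prev_end_orig
            new_gap = max(0.0, gap + rng.uniform(-max_shift, max_shift))
            new_start = prev_end_new + new_gap
        new_end = new_start + (end - start)
        out.append((word, new_start, new_end))
        prev_end_orig, prev_end_new = end, new_end
    return out


# Toy aligned utterance (timestamps are made up for illustration).
aligned = [
    ("the", 0.0, 0.2),
    ("boy", 0.22, 0.5),
    ("is", 0.8, 0.9),
    ("reaching", 2.0, 2.5),
    ("for", 5.0, 5.2),
]
encoded = encode_pauses(aligned)
augmented = encode_pauses(perturb_pauses(aligned, rng=random.Random(0)))
```

Under these assumptions, strings such as `encoded` would serve as re-training and fine-tuning text for a BERT-style model, and `augmented` as an extra training variant whose pause tokens may differ when a jittered gap crosses a threshold.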