Paper ID | SS-10.5 | ||
Paper Title | Context-Aware Speech Stress Detection in Hospital Workers Using Bi-LSTM Classifiers | ||
Authors | Amr Gaballah, Abhishek Tiwari, Institut national de la recherche scientifique, Canada; Shrikanth Narayanan, University of Southern California, United States; Tiago Falk, Institut national de la recherche scientifique, Canada | ||
Session | SS-10: Computer Audition for Healthcare (CA4H) | ||
Location | Gather.Town | ||
Session Time | Thursday, 10 June, 13:00 - 13:45 | ||
Presentation Time | Thursday, 10 June, 13:00 - 13:45 | ||
Presentation | Poster | ||
Topic | Special Sessions: Computer Audition for Healthcare (CA4H) | ||
Abstract | Hospital workers are known to work long hours in a highly stressful environment. The COVID-19 pandemic has increased this burden multi-fold. Pre-COVID statistics already showed that one in every three nurses reported burnout, which affects patient satisfaction and the quality of the service they provide. Real-time monitoring of burnout, and of underlying factors such as stress, could provide feedback not only to the clinical staff but also to hospital administrators, allowing supportive measures to be taken early. In this paper, we present a context-aware speech-based system for stress detection. We consider data from 144 hospital workers who were monitored during their daily shifts over a 10-week period; subjective stress readings were collected daily. Wearable devices measured speech features and physiological readings, such as heart rate. Environment sensors, in turn, were used to track staff movement within the hospital. Here, we show the importance of context-awareness for stress level detection based on a bidirectional LSTM deep neural network. In particular, we show the importance of hospital-location and circadian-rhythm-based contextual cues for stress prediction. Overall, we show improvements as high as 14% in F1 scores once context is incorporated, relative to using the speech features alone. |