Paper ID | BIO-3.6 |
Paper Title |
INTERPRETING GLOTTAL FLOW DYNAMICS FOR DETECTING COVID-19 FROM VOICE |
Authors |
Soham Deshmukh, Mahmoud Al Ismail, Rita Singh, Carnegie Mellon University, United States |
Session | BIO-3: Machine Learning for COVID-19 diagnosis |
Location | Gather.Town |
Session Time: | Tuesday, 08 June, 13:00 - 13:45 |
Presentation Time: | Tuesday, 08 June, 13:00 - 13:45 |
Presentation |
Poster
|
Topic |
Biomedical Imaging and Signal Processing: [BIO-MIA] Medical image analysis |
IEEE Xplore Open Preview |
Click here to view in IEEE Xplore |
Virtual Presentation |
Click here to watch in the Virtual Conference |
Abstract |
In the pathogenesis of COVID-19, impairment of respiratory functions is often one of the key symptoms. Studies show that in these cases, voice production is also adversely affected -- vocal fold oscillations are asynchronous, asymmetrical and more restricted during phonation. This paper proposes a method that analyzes the differential dynamics of the glottal flow waveform (GFW) during voice production to identify features in them that are most significant for the detection of COVID-19 from voice. Since it is hard to measure this directly in COVID-19 patients, we infer it from recorded speech signals and compare it to the GFW computed from physical model of phonation. For normal voices, the difference between the two should be minimal, since physical models are constructed to explain phonation under assumptions of normalcy. Greater differences implicate anomalies in the bio-physical factors that contribute to the correctness of the physical model, revealing their significance indirectly. Our proposed method uses a CNN-based 2-step attention model that locates anomalies in time-feature space in the difference of the two GFWs, allowing us to infer their potential as discriminative features for classification. The viability of this method is demonstrated using a clinically curated dataset of COVID-19 positive and negative subjects. |