Paper ID | BIO-10.1 |
Paper Title | HIERARCHICAL ATTENTION-BASED TEMPORAL CONVOLUTIONAL NETWORKS FOR EEG-BASED EMOTION RECOGNITION |
Authors | Chao Li, Boyang Chen, Ziping Zhao, Tianjin Normal University, China; Nicholas Cummins, King’s College London, United Kingdom; Björn Schuller, University of Augsburg, Germany |
Session | BIO-10: Deep Learning for EEG Analysis |
Location | Gather.Town |
Session Time | Thursday, 10 June, 13:00 - 13:45 |
Presentation Time | Thursday, 10 June, 13:00 - 13:45 |
Presentation | Poster |
Topic | Biomedical Imaging and Signal Processing: [BIO] Biomedical signal processing |
Abstract | EEG-based emotion recognition is an effective way to infer the inner emotional state of human beings. Recently, deep learning methods, particularly long short-term memory recurrent neural networks (LSTM-RNNs), have made encouraging progress in the field of emotion recognition. However, LSTM-RNNs are time-consuming to train and have difficulty avoiding the problem of exploding/vanishing gradients during training. In addition, EEG-based emotion recognition often suffers from the presence of silent and emotionally irrelevant frames within individual channels, and not all channels carry the same emotionally discriminative information. To tackle these problems, a hierarchical attention-based temporal convolutional network (HATCN) for efficient EEG-based emotion recognition is proposed. Firstly, a spectrogram representation is generated from the raw EEG signal in each channel to capture its time and frequency information. Secondly, temporal convolutional networks (TCNs) are utilised to automatically learn robust, intrinsic long-term dynamic characteristics of the emotional response. Finally, a hierarchical attention mechanism is investigated that aggregates emotional information at both the frame and channel levels. Experimental results on the DEAP dataset show that our method achieves an average recognition accuracy of 0.716 and an F1-score of 0.642 over four emotional dimensions and outperforms other state-of-the-art methods in a user-independent scenario. |
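Illustrative sketch (not part of the paper listing): the abstract describes a three-stage pipeline of per-channel spectrograms, a TCN over the spectrogram frames, and attention applied first over frames within each channel and then over channels. The PyTorch code below is one plausible reading of that pipeline; the module names (TemporalBlock, AdditiveAttention, HATCNSketch), the layer sizes and the two-class output head are assumptions made for illustration, not the authors' implementation.

# Minimal PyTorch sketch (not the authors' released code) of the pipeline the
# abstract describes: per-channel spectrogram frames -> temporal convolutional
# network (TCN) -> frame-level attention within each channel -> channel-level
# attention -> classifier. Layer sizes and the binary head are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Chomp1d(nn.Module):
    """Trim the extra right-hand padding so the dilated convolution stays causal."""
    def __init__(self, chomp):
        super().__init__()
        self.chomp = chomp

    def forward(self, x):
        return x[..., :-self.chomp] if self.chomp > 0 else x


class TemporalBlock(nn.Module):
    """One dilated causal 1-D convolution block with a residual connection."""
    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1, dropout=0.2):
        super().__init__()
        pad = (kernel_size - 1) * dilation
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size, padding=pad, dilation=dilation),
            Chomp1d(pad), nn.ReLU(), nn.Dropout(dropout),
            nn.Conv1d(out_ch, out_ch, kernel_size, padding=pad, dilation=dilation),
            Chomp1d(pad), nn.ReLU(), nn.Dropout(dropout),
        )
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):                       # x: (batch, in_ch, frames)
        return F.relu(self.net(x) + self.downsample(x))


class AdditiveAttention(nn.Module):
    """Soft attention pooling over a sequence of feature vectors."""
    def __init__(self, dim, att_dim=64):
        super().__init__()
        self.proj = nn.Linear(dim, att_dim)
        self.score = nn.Linear(att_dim, 1, bias=False)

    def forward(self, h):                       # h: (batch, items, dim)
        a = torch.softmax(self.score(torch.tanh(self.proj(h))), dim=1)
        return (a * h).sum(dim=1)               # attention-weighted sum: (batch, dim)


class HATCNSketch(nn.Module):
    """Hierarchical attention: pool frames per channel, then pool channels."""
    def __init__(self, n_freq_bins=64, hidden=64, n_classes=2):
        super().__init__()
        self.tcn = nn.Sequential(
            TemporalBlock(n_freq_bins, hidden, dilation=1),
            TemporalBlock(hidden, hidden, dilation=2),
            TemporalBlock(hidden, hidden, dilation=4),
        )
        self.frame_att = AdditiveAttention(hidden)
        self.channel_att = AdditiveAttention(hidden)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, channels, freq_bins, frames)
        b, c, f, t = x.shape
        h = self.tcn(x.reshape(b * c, f, t))    # (b*c, hidden, frames)
        h = h.transpose(1, 2)                   # (b*c, frames, hidden)
        ch = self.frame_att(h).reshape(b, c, -1)  # frame-level pooling per channel
        z = self.channel_att(ch)                # channel-level pooling
        return self.classifier(z)               # logits, e.g. low/high valence


# Toy forward pass: 4 samples, 32 EEG channels, 64 frequency bins, 128 frames.
logits = HATCNSketch()(torch.randn(4, 32, 64, 128))
print(logits.shape)                             # torch.Size([4, 2])

In this reading, the frame-level attention suppresses silent or emotionally irrelevant spectrogram frames within each channel, and the channel-level attention weights channels by how much discriminative information they carry, matching the two levels of the hierarchy named in the abstract.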