2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID: MLSP-17.3
Paper Title: Symmetric Sub-graph Spatio-Temporal Graph Convolution and its application in Complex Activity Recognition
Authors: Pratyusha Das, Antonio Ortega, University of Southern California, United States
Session: MLSP-17: Graph Neural Networks
Location: Gather.Town
Session Time: Wednesday, 09 June, 14:00 - 14:45
Presentation Time: Wednesday, 09 June, 14:00 - 14:45
Presentation Type: Poster
Topic: Machine Learning for Signal Processing: [MLR-DEEP] Deep learning techniques
IEEE Xplore: Open Preview available in IEEE Xplore
Abstract: Understanding complex hand actions from hand skeleton data is an important yet challenging task. In this paper, we analyze hand skeleton-based complex activities by modeling dynamic hand skeletons through a spatio-temporal graph convolutional network (ST-GCN). This model jointly learns and extracts spatio-temporal features for activity recognition. Our proposed technique, the Symmetric Sub-graph spatio-temporal graph convolutional neural network (S^2-ST-GCN), exploits the symmetric nature of hand graphs to decompose them into sub-graphs, which allow us to build a separate temporal model for the relative motion of the fingers. This sub-graph approach can be implemented efficiently by preprocessing the input data with a Haar-unit-based orthogonal matrix. Then, in addition to spatial filters, separate temporal filters can be learned for each sub-graph. We evaluate the performance of the proposed method on the First-Person Hand Action dataset. While the proposed method shows performance comparable to state-of-the-art methods in the train:test = 1:1 setting, it achieves this with greater stability. Furthermore, we demonstrate a significant performance improvement over state-of-the-art methods in the cross-person setting. S^2-ST-GCN also outperforms a finger-based decomposition of the hand graph where no preprocessing is applied.
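
The Haar-unit preprocessing mentioned in the abstract can be illustrated with a minimal sketch. The Python/NumPy snippet below is an assumption-laden illustration, not the authors' implementation: it builds an orthogonal matrix out of 2x2 Haar units that maps each mirror-symmetric joint pair to a sum (symmetric) and a difference (antisymmetric) component, which is one common way such a symmetric sub-graph decomposition is realized. The joint count, the pairing, and the function name are hypothetical placeholders.

import numpy as np

def haar_unit_matrix(num_nodes, symmetric_pairs):
    """Build an orthogonal matrix from 2x2 Haar units.

    Each symmetric node pair (i, j) is mapped to an average
    (symmetric) and a difference (antisymmetric) component;
    unpaired nodes are passed through unchanged.
    """
    H = np.zeros((num_nodes, num_nodes))
    paired = set()
    for i, j in symmetric_pairs:
        H[i, i] = H[i, j] = 1.0 / np.sqrt(2.0)   # symmetric (sum) component
        H[j, i] = 1.0 / np.sqrt(2.0)             # antisymmetric (difference) component
        H[j, j] = -1.0 / np.sqrt(2.0)
        paired.update((i, j))
    for k in range(num_nodes):
        if k not in paired:
            H[k, k] = 1.0                        # self-symmetric node: identity
    return H

# Hypothetical example: 6 hand joints with two mirror-symmetric pairs.
pairs = [(1, 2), (3, 4)]
H = haar_unit_matrix(6, pairs)
assert np.allclose(H @ H.T, np.eye(6))           # orthogonality check

# Apply the transform to skeleton data of shape (frames, joints, coords).
# The transformed signal splits into sub-graph components, on which
# separate temporal filters could then be learned, as the abstract describes.
X = np.random.randn(100, 6, 3)
X_transformed = np.einsum('ij,tjc->tic', H, X)

Because the matrix is orthogonal, the transform is invertible and energy-preserving, so it can be applied as a fixed preprocessing step before the spatial and temporal graph convolutions without losing information.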