2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID IVMSP-29.3
Paper Title CGAN-NET: CLASS-GUIDED ASYMMETRIC NON-LOCAL NETWORK FOR REAL-TIME SEMANTIC SEGMENTATION
Authors Hanlin Chen, National University of Defense Technology, China; Qingyong Hu, University of Oxford, United Kingdom; Jungang Yang, Jing Wu, National University of Defense Technology, China; Yulan Guo, National University of Defense Technology, Sun Yat-sen University, China
Session IVMSP-29: Semantic Segmentation
Location Gather.Town
Session Time Friday, 11 June, 13:00 - 13:45
Presentation Time Friday, 11 June, 13:00 - 13:45
Presentation Poster
Topic Image, Video, and Multidimensional Signal Processing: [IVTEC] Image & Video Processing Techniques
Abstract Semantic segmentation has recently made remarkable progress by introducing various non-local blocks to capture long-range dependencies. However, the improvement in segmentation accuracy usually comes at the price of a significant reduction in network efficiency, as non-local blocks usually require expensive computation and memory for dense pixel-to-pixel correlation. In this paper, we introduce a Class-Guided Asymmetric Non-local Network (CGAN-Net) to enhance the class-discriminability of the learned feature maps while maintaining real-time efficiency. The key to our approach is to calculate the dense similarity matrix on coarse semantic prediction maps instead of on the high-dimensional latent feature map. This is not only efficient in computation and memory, but also helps to learn query-dependent global context. Experiments conducted on Cityscapes and CamVid demonstrate the compelling performance of our CGAN-Net. In particular, our network achieves 76.8% mean IoU on the Cityscapes test set at a speed of 38 FPS for 1024x2048 images on a single Tesla V100 GPU.
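
The core idea in the abstract, computing pixel-to-pixel similarity on coarse class predictions rather than on high-dimensional features, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`class_guided_context`, `similarity_macs`), the dot-product affinity, the softmax normalization, and the example sizes (512 feature channels, 19 classes) are all assumptions made for illustration, and the asymmetric design mentioned in the title is not modeled here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def class_guided_context(features, class_logits):
    """Aggregate features with an affinity built from class predictions.

    features:     N x C nested list (one C-dim feature per position).
    class_logits: N x K nested list (coarse per-position class scores).
    Returns an N x C nested list of context vectors.
    Hypothetical helper: affinity is a dot product of class probabilities.
    """
    probs = [softmax(row) for row in class_logits]  # N x K
    num_positions = len(features)
    num_channels = len(features[0])
    context = []
    for i in range(num_positions):
        # Each affinity entry costs K multiply-adds instead of C.
        scores = [
            sum(p * q for p, q in zip(probs[i], probs[j]))
            for j in range(num_positions)
        ]
        weights = softmax(scores)  # query-dependent weights over positions
        context.append([
            sum(weights[j] * features[j][ch] for j in range(num_positions))
            for ch in range(num_channels)
        ])
    return context


def similarity_macs(num_positions, dim):
    """Multiply-adds needed for a dense N x N dot-product similarity matrix."""
    return num_positions * num_positions * dim


# Assumed sizes for illustration only: 512 feature channels versus 19 classes.
n = 64 * 128  # hypothetical number of spatial positions
ratio = similarity_macs(n, 512) / similarity_macs(n, 19)  # equals 512 / 19
```

Under these assumed sizes, the cost of the similarity matrix scales with the number of classes rather than the number of feature channels, so the saving factor is simply the ratio of the two dimensions. The memory for the N x N matrix itself is unchanged in this sketch.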