2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID IVMSP-22.4
Paper Title COST AFFINITY LEARNING NETWORK FOR STEREO MATCHING
Authors Shenglun Chen, Dalian University of Technology, China; Baopu Li, Baidu Research, China; Wei Wang, Hong Zhang, Haojie Li, Zhihui Wang, Dalian University of Technology, China
Session IVMSP-22: Image & Video Sensing, Modeling and Representation
Location Gather.Town
Session Time: Thursday, 10 June, 14:00 - 14:45
Presentation Time: Thursday, 10 June, 14:00 - 14:45
Presentation Poster
Topic Image, Video, and Multidimensional Signal Processing: [IVELI] Electronic Imaging
Abstract Existing stereo matching methods mainly aggregate the features output by a convolutional neural network directly to obtain more discriminative cost features, but ignore the affinity between elements of the cost feature, which also plays a key role in enhancing it. In this work, we propose a novel cost affinity learning network (CAL-Net) whose Affinity Enhanced Module (AEM) extracts the affinity of the elements in the cost feature and reconstructs a more discriminative feature. In addition, CAL-Net uses a Disparity Weight Loss (DWL) to guide training. Specifically, AEM takes advantage of the self-attention mechanism to learn the internal affinity between different elements and exploits it to reconstruct the cost feature, emphasizing informative elements. DWL calculates an adaptive weight according to the disparity error. As the error decreases, the weight gradually increases, enabling the network to transition gradually from pixel-level to sub-pixel-level disparity. Experiments demonstrate that CAL-Net improves performance, especially in textureless and reflective regions, and achieves better results on the Scene Flow and KITTI 2012 benchmarks than several typical related methods.
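The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract gives neither the AEM architecture nor the DWL formula. Everything here is an assumption made for illustration: the function names (`affinity_enhance`, `disparity_weight`, `weighted_l1`), the use of plain scaled dot-product self-attention without learned projections, the residual connection, and the weight `1 / (1 + |error|)`, chosen only because it grows as the error shrinks.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def affinity_enhance(cost):
    """Self-attention over cost-feature elements (hypothetical AEM stand-in).

    cost: list of N elements, each a list of C floats.
    Returns (affinity, enhanced): an N x N row-normalized affinity matrix
    and the N x C reconstructed feature.
    """
    dim = len(cost[0])
    scale = math.sqrt(dim)
    # Affinity between every pair of elements: scaled dot product + softmax.
    affinity = [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in cost])
        for q in cost
    ]
    enhanced = []
    for i, weights in enumerate(affinity):
        # Affinity-weighted mix of all elements.
        mixed = [
            sum(w * cost[j][c] for j, w in enumerate(weights))
            for c in range(dim)
        ]
        # Residual connection (an assumption, not stated in the abstract).
        enhanced.append([x + m for x, m in zip(cost[i], mixed)])
    return affinity, enhanced


def disparity_weight(error):
    """Hypothetical adaptive weight: larger when the disparity error is smaller."""
    return 1.0 / (1.0 + abs(error))


def weighted_l1(pred, target):
    """Mean L1 disparity loss with the hypothetical per-pixel weight applied."""
    errors = [p - t for p, t in zip(pred, target)]
    return sum(disparity_weight(e) * abs(e) for e in errors) / len(errors)


if __name__ == "__main__":
    cost = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    affinity, enhanced = affinity_enhance(cost)
    print(affinity[0], enhanced[0])
    print(disparity_weight(2.0), disparity_weight(0.1))
```

In a real training setup the weight would typically be computed from a detached error so that it scales the gradient without being optimized itself. Whether CAL-Net does this is not stated in the abstract.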