Paper ID | SS-15.6 |
Paper Title |
TEACHER-STUDENT LEARNING WITH MULTI-GRANULARITY CONSTRAINT TOWARDS COMPACT FACIAL FEATURE REPRESENTATION |
Authors |
Shurun Wang, Shiqi Wang, Wenhan Yang, City University of Hong Kong, China; Xinfeng Zhang, University of Chinese Academy of Sciences, China; Shanshe Wang, Siwei Ma, Peking University, China |
Session | SS-15: Signal Processing for Collaborative Intelligence |
Location | Gather.Town |
Session Time: | Friday, 11 June, 13:00 - 13:45 |
Presentation Time: | Friday, 11 June, 13:00 - 13:45 |
Presentation |
Poster |
Topic |
Special Sessions: Signal Processing for Collaborative Intelligence |
Abstract |
In this paper, we propose a novel end-to-end feature compression scheme that leverages the representation and learning capability of deep neural networks, targeting analysis with intelligent front-end equipment at promising accuracy and efficiency. In particular, the extracted features are compactly coded in an end-to-end manner by optimizing the rate-distortion cost to achieve a feature-in-feature representation. A multi-granularity constraint is further imposed as the optimization objective, making the feature compression "healthier" from the perspective of ultimate utility. More specifically, the coarse granularity level constraint accounts for analysis accuracy, ensuring that facial analysis remains possible with the reconstructed feature, while the fine granularity level constraint accounts for feature fidelity, preserving the quality of the original feature. Moreover, a latent code level teacher-student enhancement model is proposed to efficiently transfer the low bit-rate representation into a high bit-rate one. This strategy allows the representation cost to be adaptively shifted to decoding computations, leading to more flexible feature compression with enhanced decoding capability. We verify the effectiveness of the proposed model on facial features, and experimental results show better rate-accuracy performance than existing models. |
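The abstract does not give the exact objective, so the following is only a minimal sketch of how a rate-distortion cost with a coarse (analysis accuracy) and a fine (feature fidelity) term might be combined. Every function name, the choice of cross-entropy and mean squared error, and the weights `lam`, `alpha`, and `beta` are illustrative assumptions, not the authors' formulation.

```python
import math

def rate_bits(likelihoods):
    # Assumed rate term: estimated bits for the latent code,
    # i.e. -sum(log2 p) over the entropy model's symbol likelihoods.
    return -sum(math.log2(p) for p in likelihoods)

def fine_loss(original, reconstructed):
    # Assumed fine granularity term: mean squared error between the
    # original feature vector and the reconstructed one.
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

def coarse_loss(class_probs, true_label):
    # Assumed coarse granularity term: cross-entropy of a hypothetical
    # analysis head evaluated on the reconstructed feature.
    return -math.log(class_probs[true_label])

def total_cost(likelihoods, original, reconstructed, class_probs, true_label,
               lam=1.0, alpha=1.0, beta=1.0):
    # Rate plus weighted multi-granularity distortion (weights are hypothetical).
    distortion = (alpha * coarse_loss(class_probs, true_label)
                  + beta * fine_loss(original, reconstructed))
    return rate_bits(likelihoods) + lam * distortion

if __name__ == "__main__":
    cost = total_cost(
        likelihoods=[0.5, 0.25],   # toy symbol probabilities
        original=[1.0, 2.0],       # toy original feature
        reconstructed=[1.0, 4.0],  # toy decoded feature
        class_probs=[0.5, 0.5],    # toy analysis-head output
        true_label=0,
    )
    print(f"total cost: {cost:.4f}")
```

In a trained system these terms would be computed on network outputs and minimized jointly; raising `lam` in this sketch favors accuracy and fidelity over bit-rate.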