2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information


Paper Detail

Paper ID: IVMSP-18.2
Paper Title: EFFICIENT FACE MANIPULATION VIA DEEP FEATURE DISENTANGLEMENT AND REINTEGRATION NET
Authors: Bin Cheng, Tao Dai, Bin Chen, Shutao Xia, Tsinghua University, Peng Cheng Laboratory, China; Xiu Li, Tsinghua University, China
Session: IVMSP-18: Faces in Images & Videos
Location: Gather.Town
Session Time: Wednesday, 09 June, 16:30 - 17:15
Presentation Time: Wednesday, 09 June, 16:30 - 17:15
Presentation: Poster
Topic: Image, Video, and Multidimensional Signal Processing: [IVARS] Image & Video Analysis, Synthesis, and Retrieval
IEEE Xplore Open Preview: Click here to view in IEEE Xplore
Abstract: Deep neural networks (DNNs) have been widely used for facial manipulation. Because ground-truth face images are unavailable for manipulated outputs, existing methods focus on training deeper networks with indirect supervision (e.g., feature constraints) or in unsupervised ways (e.g., cycle-consistency loss). However, such methods cannot synthesize realistic face images well and suffer from very high training overhead. To address this issue, we propose a novel Feature Disentanglement and Reintegration network (FDRNet), which employs ground-truth images as informative supervision and dynamically adapts the fusion of their informative features in a self-supervised way. FDRNet consists of a Feature Disentanglement (FD) Network and a Feature Reintegration (FR) Network: the former encodes informative, disentangled representations from the ground-truth images, and the latter fuses these representations to reconstruct the face images. By learning disentangled representations, our method can generate plausible faces conditioned on both landmarks and identities, which can be used for a variety of face manipulation tasks. Experiments on the CelebA-HQ and FFHQ datasets demonstrate the superiority of our method over state-of-the-art methods in terms of effectiveness and efficiency.
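
To make the two-stage structure described in the abstract concrete, below is a minimal PyTorch sketch of a disentangle-then-reintegrate round trip: one network splits a face image into separate identity and landmark codes, and a second network fuses the two codes to reconstruct the image, trained self-supervised against the ground-truth input. All module names (FeatureDisentanglement, FeatureReintegration), layer choices, latent sizes, and the L1 reconstruction loss are illustrative assumptions, not the authors' implementation.

```python
# Sketch of an FD/FR-style split, assuming a simple conv encoder/decoder.
import torch
import torch.nn as nn

class FeatureDisentanglement(nn.Module):
    """Encodes a face image into separate identity and landmark codes (assumed design)."""
    def __init__(self, id_dim=256, lm_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_identity = nn.Linear(128, id_dim)   # identity branch
        self.to_landmark = nn.Linear(128, lm_dim)   # landmark/geometry branch

    def forward(self, x):
        h = self.backbone(x)
        return self.to_identity(h), self.to_landmark(h)

class FeatureReintegration(nn.Module):
    """Fuses identity and landmark codes and decodes a face image (assumed design)."""
    def __init__(self, id_dim=256, lm_dim=128):
        super().__init__()
        self.fuse = nn.Linear(id_dim + lm_dim, 128 * 8 * 8)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, id_code, lm_code):
        h = self.fuse(torch.cat([id_code, lm_code], dim=1))
        return self.decoder(h.view(-1, 128, 8, 8))

# Self-supervised reconstruction: the ground-truth image supervises its own
# disentangle-and-reintegrate round trip.
fd, fr = FeatureDisentanglement(), FeatureReintegration()
x = torch.randn(4, 3, 32, 32)                  # toy batch standing in for face crops
id_code, lm_code = fd(x)
recon = fr(id_code, lm_code)
loss = nn.functional.l1_loss(recon, x)         # reconstruction objective (assumed)
```

In this reading, manipulation amounts to swapping codes at inference time, e.g., pairing one face's identity code with another face's landmark code before reintegration; the exact fusion mechanism in FDRNet is described in the paper and not reproduced here.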