IEEE ICASSP 2021 || Toronto, Ontario, Canada || 6-11 June 2021

Paper Detail

Paper ID

MMSP-6.5

Paper Title

IMPROVING MULTIMODAL SPEECH ENHANCEMENT BY INCORPORATING SELF-SUPERVISED AND CURRICULUM LEARNING

Authors

Ying Cheng, Mengyu He, Jiashuo Yu, Rui Feng, Fudan University, China

Session

MMSP-6: Human Centric Multimedia 2

Location

Gather.Town

Session Time:

Thursday, 10 June, 14:00 - 14:45

Presentation Time:

Thursday, 10 June, 14:00 - 14:45

Presentation

Poster

Topic

Multimedia Signal Processing: Signal Processing for Multimedia Applications

Abstract

Speech enhancement in realistic scenarios still faces many challenges, such as complex background signals and data limitations. In this paper, we present a co-attention-based framework that incorporates self-supervised and curriculum learning to derive the target speech in noisy environments. Specifically, we first leverage self-supervision to pre-train the co-attention model on the task of audio-visual synchronization. The pre-trained model can focus on the lips of speakers automatically, and the self-supervised features from the model are then combined with a U-Net regression network to separate the spectrograms of sound mixtures. To make the training process easier and further improve performance, we introduce a curriculum learning scheme for the training stage of speech enhancement. Extensive experiments show that our model achieves superior performance over previous self-supervised methods for speech enhancement, and demonstrate the generalizability of our approach to a transferred dataset.
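
The abstract does not say how the curriculum is built, so the following is only a minimal sketch of the general idea of curriculum learning: train on easier examples first and gradually admit harder ones. Everything in it is an assumption made for illustration, not the authors' scheme: the function name `curriculum_subset`, the linear growth of the training pool, the `start_fraction` parameter, and the use of signal-to-noise ratio as the difficulty measure are all hypothetical.

```python
import random


def curriculum_subset(samples, difficulty, epoch, total_epochs, start_fraction=0.3):
    """Return the samples available at `epoch`, ordered easiest first.

    Hypothetical schedule: the pool grows linearly from `start_fraction`
    of the data at epoch 0 to the full dataset at the last epoch.
    """
    ranked = sorted(samples, key=difficulty)  # lowest difficulty score first
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    count = max(1, round(fraction * len(ranked)))
    return ranked[:count]


# Made-up data: each mixture is tagged with a signal-to-noise ratio in dB.
mixtures = [
    {"id": i, "snr_db": snr} for i, snr in enumerate([10, -5, 0, 5, -10, 15])
]


def by_noise(mixture):
    # Assumption: a higher SNR means an easier example, so negate it.
    return -mixture["snr_db"]


for epoch in range(3):
    pool = curriculum_subset(mixtures, by_noise, epoch, total_epochs=3)
    random.shuffle(pool)  # shuffle within the currently allowed pool
    print(f"epoch {epoch}: training on {len(pool)} of {len(mixtures)} mixtures")
    # A real training step over `pool` would go here.
```

With these made-up values, the first epoch sees only the two cleanest mixtures and the last epoch sees all six. A staged or loss-based difficulty measure would fit the same interface by swapping the `difficulty` function.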

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information
