Paper ID | SPE-26.3 |
Paper Title |
A CAPSULE NETWORK BASED APPROACH FOR DETECTION OF AUDIO SPOOFING ATTACKS |
Authors |
Anwei Luo, Enlei Li, Sun Yat-sen University, China; Yongliang Liu, Alibaba Group, China; Xiangui Kang, Sun Yat-sen University, China; Z. Jane Wang, University of British Columbia, Canada |
Session | SPE-26: Speaker Verification Spoofing and Countermeasures |
Location | Gather.Town |
Session Time: | Wednesday, 09 June, 15:30 - 16:15 |
Presentation Time: | Wednesday, 09 June, 15:30 - 16:15 |
Presentation |
Poster |
Topic |
Speech Processing: [SPE-SPKR] Speaker Recognition and Characterization |
Abstract |
Audio spoofing attacks pose a growing threat to automatic speaker verification systems and also have the potential to destabilize national security (e.g., through fake audio of influential politicians). The main purpose of anti-spoofing is to detect fake audio synthesized by advanced methods, yet current algorithms that use convolutional neural networks as classifiers generalize poorly to unknown attacks. In this paper, as a first attempt, we introduce a capsule network to improve the generalization of the detection system. To make the capsule network suitable for anti-spoofing tasks, we modify the original dynamic routing algorithm to force the model to pay more attention to artifacts, which yields better detection performance for text-to-speech and voice conversion attacks. Replay attack detection is also investigated, and the results indicate that the proposed approach is highly capable of detecting replay attacks as well. |
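
For readers unfamiliar with the "original dynamic routing algorithm" mentioned in the abstract, the following is a minimal pure-Python sketch of standard routing-by-agreement as it is commonly described for capsule networks. It is not the modified routing proposed in the paper, whose details the abstract does not give. The function names, the input layout `u_hat[i][j]`, and the choice of three iterations are assumptions made for illustration.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Standard routing-by-agreement (illustrative, not the paper's variant).

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j. Returns the output capsule vectors and the
    coupling coefficients used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: each input capsule distributes itself over outputs.
        c = [softmax(row) for row in b]
        # Output capsule j is the squashed weighted sum of its predictions.
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(n_in))
                   for k in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c
```

In a spoofing detector one might, hypothetically, use two output capsules (for example bona fide and spoof) and read the length of each output vector as a class score. The abstract does not state the architecture, so that mapping is an assumption.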