Paper ID | SPE-26.6 |
Paper Title |
Replay-Attack Detection using Features with Adaptive Spectro-Temporal Resolution |
Authors |
Meng Liu, Longbiao Wang, Tianjin University, China; Kong Aik Lee, Agency for Science, Technology and Research (A*STAR), Singapore; Xuanda Chen, Chinese Academy of Social Sciences, China; Jianwu Dang, Japan Advanced Institute of Science and Technology, Japan |
Session | SPE-26: Speaker Verification Spoofing and Countermeasures |
Location | Gather.Town |
Session Time | Wednesday, 09 June, 15:30 - 16:15 |
Presentation Time | Wednesday, 09 June, 15:30 - 16:15 |
Presentation |
Poster |
Topic |
Speech Processing: [SPE-SPKR] Speaker Recognition and Characterization |
Abstract |
Variable-resolution processing aims to improve the feature representation ability by enlarging the local discriminative details. In previous anti-spoofing studies, phones and frequencies were both proven to be sensitive to replay distortion. In this paper, an adaptive spectro-temporal resolution is proposed to obtain the optimal scale in the feature space: the frequency resolution is adaptive to frequency discrimination, while the temporal resolution is adaptive to continuous phones. In the process, phone-frequency F-ratio analysis is applied to investigate the sensitivity divergences to replay distortion among phones and frequencies. Then, attentive filters are designed to automatically adapt to the phone-frequency discrimination. Validation experiments for the proposed method are conducted on two well-acknowledged magnitude and phase features. A comparative analysis on the ASVspoof 2017 V2.0 database demonstrates that our proposed adaptive spectro-temporal resolution method attains considerably higher error reduction rates than the approaches involving the corresponding original resolution features. |
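The F-ratio analysis mentioned in the abstract can be illustrated with the generic textbook definition: for each frequency band, the variance of the class means divided by the average within-class variance. The sketch below is an assumption-laden illustration only. The function name `f_ratio`, the two-class genuine/replay labelling, and the toy data are hypothetical, and the paper's phone-frequency formulation (which also conditions on phones) may differ.

```python
import numpy as np

def f_ratio(features, labels):
    """Per-band F-ratio: variance of class means over mean within-class variance.

    features: array of shape (n_samples, n_bands), e.g. per-band energies.
    labels:   array of shape (n_samples,), one class label per sample.
    Generic definition; not necessarily the exact form used in the paper.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Mean and variance of every band, computed separately for each class.
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    class_vars = np.stack([features[labels == c].var(axis=0) for c in classes])
    between = class_means.var(axis=0)  # spread of the class means
    within = class_vars.mean(axis=0)   # average spread inside a class
    return between / (within + 1e-12)  # small constant avoids division by zero

# Toy data (hypothetical): band 0 separates the classes, band 1 does not.
feats = np.array([[1.0, 5.0], [3.0, 7.0],   # genuine
                  [5.0, 5.0], [7.0, 7.0]])  # replay
labs = np.array([0, 0, 1, 1])
scores = f_ratio(feats, labs)
print(scores)
```

Bands with a higher score are more discriminative between the classes, which is the kind of information a variable-resolution front end could use to allocate finer frequency resolution.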