Paper ID | SPE-8.1 | ||
Paper Title | SQUEEZING VALUE OF CROSS-DOMAIN LABELS: A DECOUPLED SCORING APPROACH FOR SPEAKER VERIFICATION | ||
Authors | Lantian Li, Yang Zhang, Jiawen Kang, Thomas Fang Zheng, Dong Wang, Tsinghua University, China | ||
Session | SPE-8: Speaker Recognition 2: Channel and Domain Robustness | ||
Location | Gather.Town | ||
Session Time | Tuesday, 08 June, 14:00 - 14:45 | ||
Presentation Time | Tuesday, 08 June, 14:00 - 14:45 | ||
Presentation | Poster | ||
Topic | Speech Processing: [SPE-SPKR] Speaker Recognition and Characterization | ||
Abstract | Domain mismatch often occurs in real applications and causes serious performance degradation in speaker recognition systems. The common wisdom is to collect cross-domain data and train a multi-domain PLDA model, in the hope of learning a domain-independent speaker subspace. In this paper, we first present an empirical study showing that simply adding cross-domain data does not help performance in conditions with enrollment-test mismatch. Careful analysis shows that this striking result is caused by incoherent statistics between the enrollment and test conditions. Based on this analysis, we present a decoupled scoring approach that can maximally squeeze the value of cross-domain labels and obtain optimal verification scores in the enrollment-test mismatch condition. When the statistics are coherent, the new formulation falls back to conventional PLDA. Experimental results on a cross-channel test show that the proposed approach is highly effective and is a principled solution to domain mismatch. |