Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

Cited by: 0
Authors
Wang, Qiongqiong [1 ]
Lee, Kong Aik [1 ]
Liu, Tianchi [1 ,2 ]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore, Singapore
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Source
Keywords
speaker verification; large-margin softmax; cosine similarity; PLDA; ECAPA-TDNN;
DOI
10.21437/Interspeech.2022-10055
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The emergence of large-margin softmax cross-entropy losses for training deep speaker embedding neural networks has triggered a gradual shift from parametric back-ends to a simpler cosine similarity measure for speaker verification. Popular parametric back-ends include probabilistic linear discriminant analysis (PLDA) and its variants. This paper investigates the properties of margin-based cross-entropy losses that lead to this shift, and aims to find the scoring back-ends best suited for speaker verification. In addition, we revisit pre-processing techniques that were widely used in the past and assess their effectiveness on large-margin embeddings. Experiments on state-of-the-art ECAPA-TDNN networks trained with various large-margin softmax cross-entropy losses show a substantial increase in intra-speaker compactness, making the conventional PLDA superfluous. In this regard, we found that constraining the within-speaker covariance matrix could improve the performance of the PLDA. This is demonstrated through a series of experiments on the VoxCeleb-1 and SITW core-core test sets, with a 40.8% reduction in equal error rate (EER) and a 35.1% reduction in minimum detection cost (minDCF). The constrained PLDA also outperforms cosine scoring consistently, with reductions in EER and minDCF of 10.9% and 4.9%, respectively.
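For readers unfamiliar with the cosine back-end mentioned in the abstract, the following is a minimal sketch of cosine scoring between two speaker embeddings. It is not the authors' implementation: the function name `cosine_score`, the embedding dimension of 192, and the synthetic random vectors are all assumptions made purely for illustration.

```python
import numpy as np


def cosine_score(enroll: np.ndarray, test: np.ndarray) -> float:
    """Cosine similarity between two 1-D speaker embeddings."""
    # Length-normalize both embeddings, then take their dot product.
    enroll = enroll / np.linalg.norm(enroll)
    test = test / np.linalg.norm(test)
    return float(np.dot(enroll, test))


# Synthetic stand-ins for real embeddings (dimension 192 is an arbitrary assumption).
rng = np.random.default_rng(0)
speaker_a = rng.normal(size=192)
same_speaker = speaker_a + 0.1 * rng.normal(size=192)  # small within-speaker variation
other_speaker = rng.normal(size=192)                   # unrelated direction

target_score = cosine_score(speaker_a, same_speaker)
nontarget_score = cosine_score(speaker_a, other_speaker)
```

In a verification trial, a score such as `target_score` would be compared against a threshold to accept or reject the claimed identity. Unlike PLDA, this back-end has no parameters to train, which is the simplicity the abstract refers to.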
Pages: 600-604
Page count: 5