Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

Cited by: 0
Authors
Wang, Qiongqiong [1]
Lee, Kong Aik [1]
Liu, Tianchi [1, 2]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore, Singapore
[2] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
Source
Keywords
speaker verification; large-margin softmax; cosine similarity; PLDA; ECAPA-TDNN;
DOI
10.21437/Interspeech.2022-10055
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
The emergence of large-margin softmax cross-entropy losses in training deep speaker embedding neural networks has triggered a gradual shift from parametric back-ends to a simpler cosine similarity measure for speaker verification. Popular parametric back-ends include the probabilistic linear discriminant analysis (PLDA) and its variants. This paper investigates the properties of margin-based cross-entropy losses leading to such a shift, and aims to find scoring back-ends best suited for speaker verification. In addition, we revisit the pre-processing techniques which have been widely used in the past and assess their effectiveness on large-margin embeddings. Experiments on the state-of-the-art ECAPA-TDNN networks trained with various large-margin softmax cross-entropy losses show a substantial increment in intra-speaker compactness making the conventional PLDA superfluous. In this regard, we found that constraining the within-speaker covariance matrix could improve the performance of the PLDA. It is demonstrated through a series of experiments on the VoxCeleb-1 and SITW core-core test sets with 40.8% equal error rate (EER) reduction and 35.1% minimum detection cost (minDCF) reduction. It also outperforms cosine scoring consistently with reductions in EER and minDCF by 10.9% and 4.9%, respectively.
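For readers unfamiliar with the simpler of the two back-ends compared in the abstract, cosine scoring can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `cosine_score` and `accept`, the toy three-dimensional vectors, and the threshold value are all assumptions made here, and real speaker embeddings (for example from ECAPA-TDNN) have far more dimensions.

```python
import math


def cosine_score(enroll, test):
    """Cosine similarity between an enrollment and a test embedding.

    Both arguments are plain sequences of floats of equal length.
    The score lies in [-1, 1]; higher means the two embeddings point
    in more similar directions.
    """
    dot = sum(a * b for a, b in zip(enroll, test))
    norm_enroll = math.sqrt(sum(a * a for a in enroll))
    norm_test = math.sqrt(sum(b * b for b in test))
    return dot / (norm_enroll * norm_test)


def accept(enroll, test, threshold=0.5):
    """Accept the trial if the cosine score reaches the threshold.

    The default threshold is a hypothetical value chosen for this
    sketch; in practice it is tuned on development data.
    """
    return cosine_score(enroll, test) >= threshold


# Toy embeddings, made up for illustration only.
enrollment = [0.2, 0.9, 0.1]
same_direction = [0.4, 1.8, 0.2]   # the enrollment vector scaled by 2
different = [0.9, -0.1, 0.3]

score_same = cosine_score(enrollment, same_direction)
score_diff = cosine_score(enrollment, different)
```

Because the score depends only on the angle between the two vectors, rescaling an embedding does not change it, and no back-end parameters need to be trained. PLDA, by contrast, is a parametric model whose within-speaker and between-speaker covariances are estimated from data.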
Pages: 600-604
Number of pages: 5
Related papers
50 in total
  • [1] Unifying Cosine and PLDA Back-ends for Speaker Verification
    Peng, Zhiyuan
    He, Xuanji
    Ding, Ke
    Lee, Tan
    Wan, Guanglu
    [J]. INTERSPEECH 2022, 2022, : 336 - 340
  • [2] Fast Scoring for Mixture of PLDA in I-Vector/PLDA Speaker Verification
    Mak, Man-Wai
    [J]. 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 587 - 593
  • [3] IMPROVED LARGE-MARGIN SOFTMAX LOSS FOR SPEAKER DIARISATION
    Fathullah, Y.
    Zhang, C.
    Woodland, P. C.
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7104 - 7108
  • [4] BOUNDARY DISCRIMINATIVE LARGE MARGIN COSINE LOSS FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Li, Rongjin
    Li, Na
    Tuo, Deyi
    Yu, Meng
    Su, Dan
    Yu, Dong
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6321 - 6325
  • [5] Local Training in Speaker Verification for PLDA
    Pahuja, Hunny
    Ranjan, Priya
    Ujlayan, Amit
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2017, : 1466 - 1469
  • [6] Large Margin Softmax Loss for Speaker Verification
    Liu, Yi
    He, Liang
    Liu, Jia
    [J]. INTERSPEECH 2019, 2019, : 2873 - 2877
  • [7] PLDA Modeling in the Fishervoice Subspace for Speaker Verification
    Zhong, Jinghua
    Jiang, Weiwu
    Rao, Wei
    Mak, Man-Wai
    Meng, Helen
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1130 - 1134
  • [8] PLDA Speaker Verification with Limited Speech Data
    Ridzik, Andrej
    Rusko, Milan
    [J]. SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 325 - 332
  • [9] Fisher Vectors in PLDA Speaker Verification System
    Zajic, Zbynek
    Hruz, Marek
    [J]. PROCEEDINGS OF 2016 IEEE 13TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP 2016), 2016, : 1339 - 1342
  • [10] On Deep Speaker Embeddings for Speaker Verification
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Chmulik, Michal
    [J]. 2021 44TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2021, : 162 - 166