Unifying Cosine and PLDA Back-ends for Speaker Verification

Cited by: 3
Authors
Peng, Zhiyuan [1, 2]
He, Xuanji [2]
Ding, Ke [2]
Lee, Tan [1]
Wan, Guanglu [2]
Affiliations
[1] Chinese Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[2] Meituan, Beijing, Peoples R China
Source
Keywords
speaker verification; cosine; PLDA; dimensional independence;
DOI
10.21437/Interspeech.2022-10021
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
State-of-the-art speaker verification (SV) systems use a back-end model to score the similarity of speaker embeddings extracted from a neural network. The commonly used back-ends are cosine scoring and probabilistic linear discriminant analysis (PLDA) scoring. With recently developed neural embeddings, the theoretically more appealing PLDA approach is found to have no advantage over, or even to be inferior to, simple cosine scoring in terms of verification performance. This paper investigates the relation between the two back-ends, aiming to explain this counter-intuitive observation. It is shown that cosine scoring is essentially a special case of PLDA scoring: when the PLDA parameters are set appropriately, the two back-ends become equivalent. As a consequence, cosine scoring not only inherits the basic assumptions of PLDA but also introduces additional assumptions on the speaker embeddings. Experiments show that the dimensional independence assumption required by cosine scoring contributes most to the performance gap between the two methods under the domain-matched condition. When there is severe domain mismatch, the dimensional independence assumption does not hold, and PLDA would perform better than cosine scoring for domain adaptation.
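The "special case" claim in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes a zero-mean two-covariance PLDA model with between-speaker covariance B and within-speaker covariance W (these symbols, and all function names, are chosen here for illustration), and it assumes length-normalized embeddings. Under those assumptions, choosing isotropic covariances B = b·I and W = w·I makes the PLDA log-likelihood ratio an affine function of the cosine score, with slope b / ((b + w)² − b²), so both back-ends rank trials identically.

```python
import numpy as np


def length_normalize(x):
    """Scale an embedding to unit Euclidean norm."""
    return x / np.linalg.norm(x)


def cosine_score(x1, x2):
    """Cosine similarity between two embeddings."""
    return float(x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2)))


def _zero_mean_gauss_logpdf(z, cov):
    """Log density of N(0, cov) evaluated at z."""
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (len(z) * np.log(2.0 * np.pi) + logdet + quad)


def plda_llr(x1, x2, B, W):
    """Two-covariance PLDA log-likelihood ratio (assumed model, zero mean).

    Same-speaker hypothesis: the pair shares one speaker variable, so the
    cross-covariance between x1 and x2 is B. Different-speaker hypothesis:
    the two embeddings are independent.
    """
    T = B + W  # total covariance of a single embedding
    zero = np.zeros_like(B)
    z = np.concatenate([x1, x2])
    cov_same = np.block([[T, B], [B, T]])
    cov_diff = np.block([[T, zero], [zero, T]])
    return float(_zero_mean_gauss_logpdf(z, cov_same)
                 - _zero_mean_gauss_logpdf(z, cov_diff))


def demo_affine_relation(dim=8, n_pairs=5, b=0.7, w=0.3, seed=0):
    """Score random unit-norm pairs with both back-ends (illustrative values)."""
    rng = np.random.default_rng(seed)
    B = b * np.eye(dim)  # isotropic between-speaker covariance (assumption)
    W = w * np.eye(dim)  # isotropic within-speaker covariance (assumption)
    cos, llr = [], []
    for _ in range(n_pairs):
        x1 = length_normalize(rng.standard_normal(dim))
        x2 = length_normalize(rng.standard_normal(dim))
        cos.append(cosine_score(x1, x2))
        llr.append(plda_llr(x1, x2, B, W))
    return np.array(cos), np.array(llr)


if __name__ == "__main__":
    b, w = 0.7, 0.3
    cos, llr = demo_affine_relation(b=b, w=w)
    slope = b / ((b + w) ** 2 - b ** 2)
    # If the LLR is affine in the cosine score, this offset is the same for every pair.
    print(llr - slope * cos)
```

The isotropic, diagonal covariances are where a dimensional independence assumption enters this toy setup: with a full (non-diagonal) B or W, the log-likelihood ratio is in general no longer a function of the cosine score alone.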
Pages: 336-340
Number of pages: 5