PLDA-BASED DIARIZATION OF TELEPHONE CONVERSATIONS

Cited by: 0
Authors
Bulut, Ahmet Emin [1, 2]
Demir, Hakan [1]
Isik, Yusuf Ziya [1, 2]
Erdogan, Hakan [2]
Affiliations
[1] TUBITAK BILGEM, Gebze, Turkey
[2] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
Keywords
speaker diarization; i-vector; PLDA; deterministic annealing; variational Bayes
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper investigates the application of probabilistic linear discriminant analysis (PLDA) to speaker diarization of telephone conversations. We introduce a variational Bayes (VB) approach to inference under a PLDA model of segmental i-vectors for speaker diarization. A deterministic annealing (DA) algorithm is applied to avoid locally optimal solutions in the VB iterations. We compare the proposed system with a well-known baseline that applies k-means clustering to the principal component analysis (PCA) coefficients of segmental i-vectors. To evaluate the proposed system, we use summed-channel telephone data from the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation (SRE) as the test set. We achieve about a 20% relative improvement in diarization error rate (DER) over the baseline system.
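The baseline named in the abstract, k-means clustering on PCA coefficients of segmental i-vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic vectors stand in for real i-vectors, and the choice of two speakers, two PCA components, the helper names `pca_project` and `kmeans`, and the farthest-point initialisation are all assumptions made for illustration.

```python
import numpy as np


def pca_project(x, n_components):
    """Project the rows of x onto the top principal directions."""
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(x, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance of every point to its nearest chosen centre.
        dist = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for segmental i-vectors (assumption: two speakers,
# each segment is a speaker mean plus small within-speaker noise).
rng = np.random.default_rng(0)
dim, per_speaker = 20, 30
spk_a = rng.normal(0.0, 1.0, dim)
spk_b = rng.normal(0.0, 1.0, dim)
segments = np.vstack([
    spk_a + 0.1 * rng.normal(size=(per_speaker, dim)),
    spk_b + 0.1 * rng.normal(size=(per_speaker, dim)),
])
true_labels = np.repeat([0, 1], per_speaker)

coeffs = pca_project(segments, n_components=2)  # PCA coefficients per segment
labels = kmeans(coeffs, k=2)                    # one cluster per assumed speaker
```

Each entry of `labels` assigns a segment to one of the two assumed speakers. Unlike the VB inference under PLDA described in the abstract, this baseline makes hard assignments and uses no speaker-variability model.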
Pages: 4809-4813
Number of pages: 5