PLDA-BASED DIARIZATION OF TELEPHONE CONVERSATIONS

Cited by: 0
Authors
Bulut, Ahmet Emin [1, 2]
Demir, Hakan [1]
Isik, Yusuf Ziya [1, 2]
Erdogan, Hakan [2]
Affiliations
[1] TUBITAK BILGEM, Gebze, Turkey
[2] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
Keywords
speaker diarization; i-vector; PLDA; deterministic annealing; variational Bayes
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper investigates the application of probabilistic linear discriminant analysis (PLDA) to speaker diarization of telephone conversations. We introduce a variational Bayes (VB) approach for inference under a PLDA model of segmental i-vectors. A deterministic annealing (DA) algorithm is applied to avoid locally optimal solutions in the VB iterations. We compare the proposed system with a well-known baseline that applies k-means clustering to the principal component analysis (PCA) coefficients of segmental i-vectors. The test set is summed-channel telephone data from the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation (SRE). The proposed system achieves about 20% relative improvement in diarization error rate (DER) over the baseline.
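To make the baseline concrete, the following is a minimal sketch of k-means clustering on PCA coefficients of segment-level vectors, assuming a two-speaker conversation. It is not the authors' implementation: the helper names (`pca_project`, `kmeans`, `make_toy_ivectors`), the dimensions, and the synthetic Gaussian data standing in for real segmental i-vectors are all assumptions made for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project rows of X onto the top principal directions."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def make_toy_ivectors(n_per_speaker=40, dim=20, seed=0):
    """Synthetic stand-in for segmental i-vectors (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    speaker_means = rng.normal(0.0, 3.0, size=(2, dim))
    X = np.vstack([
        m + rng.normal(0.0, 0.5, size=(n_per_speaker, dim))
        for m in speaker_means
    ])
    y = np.repeat([0, 1], n_per_speaker)
    return X, y


X, y = make_toy_ivectors()
coeffs = pca_project(X, n_components=3)
labels = kmeans(coeffs, k=2)

# Cluster indices are arbitrary, so score agreement up to label swap.
agreement = max(np.mean(labels == y), np.mean(labels != y))
print(f"segment-level agreement: {agreement:.2f}")
```

In a real system the rows of `X` would be i-vectors extracted from short speech segments, and the resulting labels would be turned into a time-stamped segmentation and scored with DER rather than the simple agreement measure used here.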
Pages: 4809-4813
Page count: 5