VARIATIONAL BAYES BASED I-VECTOR FOR SPEAKER DIARIZATION OF TELEPHONE CONVERSATIONS

Cited by: 0
Authors
Zheng, Rong [1]
Zhang, Ce [1]
Zhang, Shanshan [1]
Xu, Bo [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Interact Digital Media Technol Res Ctr, Beijing, Peoples R China
Keywords
speaker diarization; eigenvoices; I-vector; total variability; variational Bayes
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we investigate a variational Bayes based I-vector method for speaker diarization of telephone conversations. The proposed algorithm is motivated by two goals: to use the variational Bayesian framework, and to exploit the potential channel effects captured by total variability modeling when diarizing conversation sides. It is compared with three well-known techniques: K-means clustering applied to eigenvoices, K-means clustering applied to I-vectors, and variational Bayes applied to eigenvoices. Performance is evaluated on the summed-channel telephone data from the 2008 NIST speaker recognition evaluation. The paper discusses how performance is influenced by different modules, e.g., VAD, initial speaker clustering, and Viterbi re-segmentation. Comparison experiments show the benefit of the variational Bayesian probabilistic framework for speaker diarization.
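As a rough illustration of the K-means baseline named in the abstract, the sketch below clusters per-segment vectors into two speakers. It is not the authors' implementation: the function name `kmeans_two_speakers`, the length normalization, the farthest-point initialization, and the synthetic 10-dimensional vectors standing in for segment I-vectors are all assumptions made for the example.

```python
import numpy as np


def kmeans_two_speakers(vectors, n_iter=20, seed=0):
    """Cluster segment vectors into two speakers with plain K-means (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumption: length-normalize so Euclidean K-means approximates cosine clustering.
    x = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    # Assumption: pick one random segment, then the segment farthest from it.
    first = rng.integers(len(x))
    second = np.argmax(np.linalg.norm(x - x[first], axis=1))
    centroids = np.stack([x[first], x[second]])
    for _ in range(n_iter):
        # Assign every segment to its nearest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each centroid to the mean of its assigned segments.
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = x[labels == k].mean(axis=0)
    # Final assignment against the updated centroids.
    dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


# Synthetic stand-in for segment I-vectors: 20 segments per speaker, 10 dimensions.
rng = np.random.default_rng(1)
dim = 10
mean_a = np.zeros(dim)
mean_a[0] = 1.0
mean_b = np.zeros(dim)
mean_b[1] = 1.0
segments = np.vstack([
    mean_a + 0.05 * rng.normal(size=(20, dim)),
    mean_b + 0.05 * rng.normal(size=(20, dim)),
])
labels = kmeans_two_speakers(segments)
```

In a real system, the vectors would come from an I-vector extractor applied to speech segments after VAD, and the cluster labels would typically be refined afterwards, for example by the Viterbi re-segmentation step mentioned in the abstract.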
Pages: 5