VARIATIONAL BAYES BASED I-VECTOR FOR SPEAKER DIARIZATION OF TELEPHONE CONVERSATIONS

Cited by: 0
Authors
Zheng, Rong [1]
Zhang, Ce [1]
Zhang, Shanshan [1]
Xu, Bo [1]
Affiliation
[1] Chinese Academy of Sciences, Institute of Automation, Interactive Digital Media Technology Research Center, Beijing, People's Republic of China
Keywords
speaker diarization; eigenvoices; I-vector; total variability; variational Bayes
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we investigate a variational Bayes based I-vector method for speaker diarization of telephone conversations. The proposed algorithm is motivated by using the variational Bayesian framework and exploiting the potential channel effects captured by total variability modeling for diarization of conversation sides. It is compared with three well-known techniques: K-means clustering applied to eigenvoices, K-means clustering applied to I-vectors, and variational Bayes applied to eigenvoices. Performance is evaluated on the summed-channel telephone data from the 2008 NIST Speaker Recognition Evaluation. The paper discusses how performance is influenced by individual modules, e.g., VAD, initial speaker clustering, and Viterbi re-segmentation. Comparison experiments show the value of the variational Bayesian probabilistic framework for speaker diarization.
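As a rough illustration of the K-means baseline named in the abstract, the sketch below clusters segment-level vectors into two speakers using cosine similarity. It is not the authors' implementation: the function names `length_normalize` and `kmeans_two_speakers` are hypothetical, the two-speaker assumption reflects a typical telephone conversation, and the 3-dimensional synthetic vectors merely stand in for I-vectors that would in practice come from a trained total variability model.

```python
import numpy as np


def length_normalize(x):
    """Scale each row to unit Euclidean norm so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def kmeans_two_speakers(segment_vectors, n_iter=20):
    """Cluster segment-level vectors into two groups with cosine-based K-means.

    Hypothetical helper for illustration only; assumes exactly two speakers.
    """
    x = length_normalize(np.asarray(segment_vectors, dtype=float))
    # Deterministic initialization: the first segment, then the segment
    # least similar to it.
    first = x[0]
    second = x[np.argmin(x @ first)]
    centroids = np.stack([first, second])
    labels = None
    for _ in range(n_iter):
        # Assign each segment to the centroid with the highest cosine similarity.
        new_labels = np.argmax(x @ centroids.T, axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each centroid as the mean of its members, then renormalize.
        for k in range(2):
            members = x[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
        centroids = length_normalize(centroids)
    return labels


# Synthetic stand-ins for segment-level I-vectors (assumption: 3 dimensions,
# five segments per speaker, well-separated speaker directions).
rng = np.random.default_rng(0)
speaker_a = rng.normal(loc=[3.0, 0.0, 0.0], scale=0.1, size=(5, 3))
speaker_b = rng.normal(loc=[0.0, 3.0, 0.0], scale=0.1, size=(5, 3))
toy_vectors = np.vstack([speaker_a, speaker_b])
labels = kmeans_two_speakers(toy_vectors)
```

On this toy data the first five segments receive one label and the last five the other. A full diarization system would, as the abstract notes, also involve VAD, initial speaker clustering, and Viterbi re-segmentation, none of which are modeled here.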
Pages: 5