Phonetic Subspace Mixture Model for Speaker Diarization

Authors
Chen, I-Fan [1]
Cheng, Shih-Sian [2]
Wang, Hsin-Min [1, 2]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 115, Taiwan
[2] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Keywords
BIC; phonetic information; speaker diarization;
DOI
Not available
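Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808; 0809;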
Abstract
This paper presents an improved distance measure for speaker clustering in speaker diarization systems. The proposed phonetic subspace mixture (PSM) model introduces phonetic information into the Delta BIC distance measure. As a result, the new PSM model-based Delta BIC distance measure can remove the effect of phonetic content on the diarization results. The typical Delta BIC distance measure can be seen as a special case of the new measure. Our experimental results show that the new distance measure consistently improves speaker diarization performance on three datasets.
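The abstract does not give the PSM formulation, so the sketch below shows only the conventional Delta BIC distance that it describes as a special case. It is an illustration under stated assumptions, not the authors' method: each segment is modeled by a single diagonal-covariance Gaussian (full covariances are also common), and the names `diag_gaussian_stats`, `delta_bic`, and `penalty_weight` are hypothetical.

```python
import math


def diag_gaussian_stats(frames):
    """Return (frame count, per-dimension ML variances) for a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    # Floor the variances so the logarithm below stays finite (assumed floor value).
    variances = [
        max(sum((f[k] - means[k]) ** 2 for f in frames) / n, 1e-8)
        for k in range(dim)
    ]
    return n, variances


def log_det(variances):
    """Log-determinant of a diagonal covariance matrix."""
    return sum(math.log(v) for v in variances)


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Conventional Delta BIC between two segments, assuming diagonal Gaussians.

    Positive values favour modeling the segments separately (different speakers);
    negative values favour merging them into one cluster.
    """
    n_a, var_a = diag_gaussian_stats(seg_a)
    n_b, var_b = diag_gaussian_stats(seg_b)
    n_ab, var_ab = diag_gaussian_stats(seg_a + seg_b)
    dim = len(var_ab)
    # Log-likelihood gain of two Gaussians over one pooled Gaussian.
    gain = 0.5 * (n_ab * log_det(var_ab) - n_a * log_det(var_a) - n_b * log_det(var_b))
    # The two-model hypothesis has 2 * dim extra parameters (dim means + dim variances).
    penalty = 0.5 * penalty_weight * (2 * dim) * math.log(n_ab)
    return gain - penalty
```

In agglomerative speaker clustering, a distance of this kind is typically evaluated for every pair of clusters, and the pair with the lowest value is merged until no pair scores below zero. Because the pooled Gaussian here also absorbs differences in what was said, phonetic content can inflate the distance between two segments from the same speaker, which is the effect the PSM-based measure is intended to remove.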
Pages: 2298 / +
Page count: 2