Improving Speaker Segmentation via Speaker Identification and Text Segmentation

被引:0
|
作者
Li, Runxin [1 ]
Schultz, Tanja [1 ]
Jin, Qin [1 ]
机构
[1] Carnegie Mellon Univ, Language Technol Inst, InterACT, Pittsburgh, PA 15213 USA
关键词
speaker diarization; speaker segmentation; speaker identification; text segmentation;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Speaker segmentation is an essential part of a speaker diarization system. Common segmentation systems usually miss speaker change points when speakers switch fast. These errors seriously confuse the following speaker clustering step and result in high overall speaker diarization error rates. In this paper two methods are proposed to deal with this problem: The first approach uses speaker identification techniques to boost speaker segmentation. And the second approach applies text segmentation methods to improve the performance of speaker segmentation. Experiments on Quaero speaker diarization evaluation data shows that our methods achieve up to 45% relative reduction in the speaker diarization error and 64% relative increase in the speaker change detection recall rate over the baseline system. Moreover, both these two approaches can be considered as post-processing steps over the baseline segmentation, therefore, they can be applied in any speaker diarization systems.
引用
收藏
页码:928 / 931
页数:4
相关论文
共 50 条
  • [1] Speaker segmentation and clustering
    Kotti, Margarita
    Moschou, Vassiliki
    Kotropoulos, Constantine
    [J]. SIGNAL PROCESSING, 2008, 88 (05) : 1091 - 1124
  • [2] Bayes Factor Based Speaker Segmentation for Speaker Diarization
    Wang, D.
    Vogt, R.
    Sridharan, S.
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1405 - 1408
  • [3] Factor Analysis for Speaker Segmentation and Improved Speaker Diarization
    Desplanques, Brecht
    Demuynck, Kris
    Martens, Jean-Pierre
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3081 - 3085
  • [4] Bayes Factor Based Speaker Segmentation for Speaker Diarization
    Speech and Audio Research Laboratory, Queensland University of Technology, Brisbane, Australia
    [J]. Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH, (1405-1408):
  • [5] Joint Speaker Segmentation, Localization and Identification for Streaming Audio
    Schmalenstroeer, Joerg
    Haeb-Umbach, Reinhold
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 453 - 456
  • [6] Confidence Measures for Speaker Segmentation and their Relation to Speaker Verification
    Vaquero, Carlos
    Ortega, Alfonso
    Villalba, Jesus
    Miguel, Antonio
    Lleida, Eduardo
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2310 - 2313
  • [7] SPEAKER SEGMENTATION USING DEEP SPEAKER VECTORS FOR FAST SPEAKER CHANGE SCENARIOS
    Wang, Renyu
    Gu, Mingliang
    Li, Lantian
    Xu, Mingxing
    Zheng, Thoms Fang
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5420 - 5424
  • [8] Using Phoneme Recognition and Text-dependent Speaker Verification to Improve Speaker Segmentation for Chinese Speech
    Wang, Gang
    Wu, Xiaojun
    Zheng, Thomas Fang
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1457 - 1460
  • [9] Voting for two speaker segmentation
    Narayanaswamy, Balakrishnan
    Gangadharaiah, Rashmi
    Stern, Richard
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2086 - +
  • [10] Location based speaker segmentation
    Lathoud, G
    McCowan, IA
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 176 - 179