The Influence of Speech Activity Detection and Overlap on Speaker Diarization for Meeting Room Recordings

Authors
Fredouille, Corinne [1]
Evans, Nicholas [1]
Affiliations
[1] Univ Avignon, LIA, Avignon, France
Keywords
speaker diarization; meeting room; speech activity detection; overlapped speech
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the problem of speaker diarization in the specific context of meeting room recordings, which often involve a high degree of spontaneous speech with large overlapped speech segments, speaker noise (laughs, whispers, coughs, etc.) and very short speaker turns. Large variability in signal quality adds a further level of complexity. This paper investigates the effects of speech activity detection and overlapped speech through speaker diarization experiments conducted on the NIST RT'05 and RT'06 data sets. Results indicate that our system is highly sensitive to the shape of the initial segmentation and that, perhaps surprisingly, perfect references can even degrade performance. Finally, we propose a direction for future research: incorporating confidence values according to acoustic attributes in order to unify what is currently a somewhat disjointed approach to speaker diarization.
Pages: 2704-2707
Page count: 4