The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization

Cited by: 0
Authors
Zelenak, Martin [1]
Hernando, Javier [1]
Affiliations
[1] Univ Politecn Cataluna, Barcelona, Spain
Keywords
overlapping speech detection; prosody; feature selection; speaker diarization
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Overlapping speech accounts for a share of the errors produced by standard speaker diarization systems in meeting environments. We investigate a set of prosody-based long-term features as a potential complement to our overlap detection system, which relies on short-term spectral parameters. The most relevant features are selected in a two-step process: they are first evaluated and ranked according to the mRMR (minimum redundancy, maximum relevance) criterion, and the optimal number of features is then determined by an iterative wrapper approach. We show that adding prosodic features decreases overlap detection error. Detected overlap segments are used in speaker diarization to recover missed speech by assigning multiple speaker labels and to increase the purity of speaker clusters.
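The two-step selection described in the abstract (an mRMR ranking followed by an iterative wrapper) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. It assumes absolute Pearson correlation as a simple stand-in for the mutual-information terms normally used in mRMR, and it assumes a hypothetical `evaluate` callback that returns the overlap detection error obtained with a given feature subset on held-out data. The names `abs_corr`, `mrmr_rank`, and `wrapper_select` are invented for this example.

```python
from math import sqrt
from statistics import mean


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


def mrmr_rank(columns, labels):
    """Step 1: greedy mRMR-style ordering of feature indices.

    Each candidate is scored as its relevance to the labels minus its
    mean redundancy with the features already picked (assumption:
    correlation replaces mutual information for both terms).
    """
    relevance = [abs_corr(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    ranked = []

    def score(j):
        if not ranked:
            return relevance[j]
        redundancy = mean([abs_corr(columns[j], columns[k]) for k in ranked])
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def wrapper_select(ranked, evaluate):
    """Step 2: try the top-k ranked features for k = 1..n and keep the
    subset with the lowest error reported by the hypothetical
    `evaluate(feature_indices) -> error` callback."""
    best_k = min(range(1, len(ranked) + 1), key=lambda k: evaluate(ranked[:k]))
    return ranked[:best_k]


# Toy data: feature 1 duplicates feature 0, feature 2 is weaker but distinct.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 1, 0],
]
toy_ranking = mrmr_rank(toy_columns, toy_labels)
```

On the toy data, the duplicated feature is pushed to the end of the ranking because it adds no information beyond the feature already chosen. In practice, `evaluate` would train and score the overlap detector for each candidate subset, which is the costly part of a wrapper approach and the reason the ranking step is applied first.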
Pages: 1048-1051
Page count: 4