The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization

Cited by: 0
Authors
Zelenak, Martin [1]
Hernando, Javier [1]
Affiliation
[1] Univ Politecn Cataluna, Barcelona, Spain
Keywords
overlapping speech detection; prosody; feature selection; speaker diarization
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Overlapping speech is responsible for a certain amount of the errors produced by standard speaker diarization systems in meeting environments. We investigate a set of prosody-based long-term features as a potential complement to our overlap detection system, which relies on short-term spectral parameters. The most relevant features are selected in a two-step process: they are first evaluated and ranked according to the mRMR (minimum redundancy, maximum relevance) criterion, and the optimal number of features is then determined by an iterative wrapper approach. We show that adding prosodic features decreases the overlap detection error. Detected overlap segments are used in speaker diarization to recover missed speech by assigning multiple speaker labels and to increase the purity of speaker clusters.
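The two-step selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width binning, the difference form of the mRMR score, the nearest-centroid stand-in detector, and all function names (`discretize`, `mrmr_rank`, `centroid_error`, `wrapper_select`) are assumptions made for illustration, and the error is measured on the training data only to keep the example short.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map real values to equal-width bin indices (assumed preprocessing)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant features
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Step 1: greedy mRMR ordering (relevance minus mean redundancy)."""
    binned = [discretize(col) for col in columns]
    relevance = [mutual_information(col, labels) for col in binned]
    ranked, remaining = [], list(range(len(columns)))
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(
                mutual_information(binned[j], binned[s]) for s in ranked
            )
            return relevance[j] - redundancy / len(ranked)

        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def centroid_error(columns, labels, subset):
    """Stand-in detector (assumption): nearest-centroid training error."""
    rows = [[columns[j][i] for j in subset] for i in range(len(labels))]
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(v) / len(members) for v in zip(*members)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )

    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(labels)


def wrapper_select(columns, labels, ranked, error_fn=centroid_error):
    """Step 2: grow the set in mRMR order; keep the lowest-error prefix."""
    best_k, best_err = 1, float("inf")
    for k in range(1, len(ranked) + 1):
        err = error_fn(columns, labels, ranked[:k])
        if err < best_err:  # ties keep the smaller feature set
            best_k, best_err = k, err
    return ranked[:best_k], best_err
```

Here `columns` is a list of feature columns (one list of values per feature) and `labels` marks each frame as overlap or non-overlap. In a realistic setting the wrapper's `error_fn` would train and score the actual overlap detector on held-out data rather than reuse the training set.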
Pages: 1048-1051
Number of pages: 4