SINGLE-CHANNEL SPEAKER DIARIZATION BASED ON SPATIAL FEATURES

Citations: 0
Authors
Hu, Mathieu [1]
Parada, Pablo Peso [2]
Sharma, Dushyant [2]
Doclo, Simon [3, 4]
van Waterschoot, Toon [5]
Brookes, Mike [1]
Naylor, Patrick A. [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
[2] Nuance Commun Inc, Voicemail To Text Res, Marlow, Bucks, England
[3] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany
[4] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4All, D-26111 Oldenburg, Germany
[5] Katholieke Univ Leuven, Dept Elect Engn ESAT STADIUS ETC, Leuven, Belgium
Keywords
Speaker diarization; direct-to-reverberant ratio; spatial acoustic features
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speaker diarization has gained much importance over the past five years in helping overcome key challenges faced by automatic meeting transcription systems. Current state-of-the-art algorithms can only utilize spatial information when multi-microphone recordings are available. In this paper, we propose the novel use of reverberation as a source of spatial information obtained from single-channel recordings to perform speaker diarization. The proposed system is shown to reduce speaker classification errors by 34% when compared with current MFCC-based single-channel systems.
Pages: 5