Speaker diarization for multi-microphone meetings using only between-channel differences

Cited by: 0
Authors
Pardo, Jose M. [1, 2]
Anguera, Xavier [1, 3]
Wooters, Chuck [1]
Affiliations
[1] Int Comp Sci Inst, Berkeley, CA 94708 USA
[2] Univ Politecn Madrid, E-28040 Madrid, Spain
[3] Tech Univ Catalonia, Barcelona, Spain
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method to extract speaker turn segmentation from multiple distant microphones (MDM) using only delay values found via cross-correlation between the available channels. The method is robust to the number of speakers (which is unknown to the system), the number of channels, and the acoustics of the room. The delays between channels are processed and clustered to obtain a segmentation hypothesis. We have obtained a 31.2% diarization error rate (DER) on NIST's RT05s MDM conference room evaluation set. For an MDM subset of NIST's RT04s development set, we have obtained 36.93% DER and 35.73% DER*. Compared with the results reported by Ellis and Liu [8], who also used between-channel differences on the same data, this is a 43% relative improvement in the error rate.
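The delay-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes plain time-domain cross-correlation with NumPy, a fixed reference channel, and fixed-length analysis windows; the function names, window sizes, and lag limit are hypothetical and are not taken from the paper. The processing and clustering of the delay vectors into speaker turns is not shown.

```python
import numpy as np


def estimate_delay(ref, sig, max_lag):
    """Lag in samples (within +/- max_lag) at which sig best matches ref.

    A positive result means sig trails ref. Illustrative helper only.
    """
    ref = np.asarray(ref, dtype=float)
    sig = np.asarray(sig, dtype=float)
    zero = len(ref) - 1  # index of zero lag in the "full" correlation output
    if max_lag > zero or max_lag > len(sig) - 1:
        raise ValueError("max_lag must be smaller than the segment length")
    corr = np.correlate(sig, ref, mode="full")
    window = corr[zero - max_lag: zero + max_lag + 1]
    return int(np.argmax(window)) - max_lag


def delay_features(channels, ref_idx=0, win=8000, hop=4000, max_lag=200):
    """One delay vector per analysis window.

    channels: array of shape (n_channels, n_samples).
    Returns an array of shape (n_windows, n_channels - 1) holding the delay
    of every non-reference channel relative to channel ref_idx.
    """
    channels = np.asarray(channels, dtype=float)
    others = [c for c in range(channels.shape[0]) if c != ref_idx]
    feats = []
    for start in range(0, channels.shape[1] - win + 1, hop):
        seg = channels[:, start:start + win]
        feats.append(
            [estimate_delay(seg[ref_idx], seg[c], max_lag) for c in others]
        )
    return np.array(feats)
```

Each row of the returned array is a vector of between-channel delays for one window. Under the assumptions above, windows dominated by the same speaker position should yield similar vectors, which is what makes them candidates for clustering.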
Pages: 257 / +
Page count: 3
Related papers
50 in total
  • [1] MULTI-CHANNEL SPEAKER DIARIZATION USING SPATIAL FEATURES FOR MEETINGS
    Zheng, Naijun
    Li, Na
    Yu, JianWei
    Weng, Chao
    Su, Dan
    Liu, XunYing
    Meng, Helen
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7337 - 7341
  • [2] Speaker Diarization for Multiple Distant Microphone Meetings: Mixing Acoustic Features And Inter-Channel Time Differences
    Pardo, Jose M.
    Anguera, Xavier
    Wooters, Chuck
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2194 - 2197
  • [3] Robust speaker identification using multi-microphone systems
    Barger, P
    Sridharan, S
    [J]. IEEE TENCON'97 - IEEE REGIONAL 10 ANNUAL CONFERENCE, PROCEEDINGS, VOLS 1 AND 2: SPEECH AND IMAGE TECHNOLOGIES FOR COMPUTING AND TELECOMMUNICATIONS, 1997, : 261 - 264
  • [4] Text-independent speaker identification using soft channel selection in a multi-microphone environment
    Ji, Mikyong
    Kim, Sungtak
    Kim, Hoirin
    Yoon, Ho-Sub
    [J]. 2008 DIGEST OF TECHNICAL PAPERS INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, 2008, : 456 - 457
  • [5] Speaker diarization for multiple-distant-microphone meetings using several sources of information
    Pardo, Jose M.
    Anguera, Xavier
    Wooters, Charles
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2007, 56 (09) : 1212 - 1224
  • [6] Speaker diarization for multi-party meetings using acoustic fusion
    Anguera, X
    Wooters, C
    Hernando, J
    [J]. 2005 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2005, : 426 - 431
  • [7] Estimating Dominance in Multi-Party Meetings Using Speaker Diarization
    Hung, Hayley
    Huang, Yan
    Friedland, Gerald
    Gatica-Perez, Daniel
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04) : 847 - 860
  • [8] Multi-Microphone Speaker Separation based on Deep DOA Estimation
    Chazan, Shlomo E.
    Hammer, Hodaya
    Hazan, Gershon
    Goldberger, Jacob
    Gannot, Sharon
    [J]. 2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [9] Data-Driven Multi-Microphone Speaker Localization on Manifolds
    Laufer-Goldshtein, Bracha
    Talmon, Ronen
    Gannot, Sharon
    [J]. FOUNDATIONS AND TRENDS IN SIGNAL PROCESSING, 2020, 14 (1-2) : 1 - 161
  • [10] Multi-Stream Speaker Diarization Systems for the Meetings Domain
    Gallardo-Antolin, Ascension
    Anguera, Xavier
    Wooters, Chuck
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2186 - +