Automatic speaker clustering from multi-speaker utterances

Cited by: 1
Authors
McLaughlin, J [1 ]
Reynolds, D [1 ]
Singer, E [1 ]
O'Leary, GC [1 ]
Affiliation
[1] MIT, Lincoln Lab, Lexington, MA 02420 USA
DOI
10.1109/ICASSP.1999.759796
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Blind clustering of multi-person utterances by speaker is complicated by the fact that each utterance has at least two talkers. In the case of a two-person conversation, one can simply split each conversation into its respective speaker halves, but this introduces error, which ultimately hurts clustering. We propose a clustering algorithm that can associate each conversation with two clusters (and therefore two speakers), obviating the need for splitting. Results are given for two-speaker conversations culled from the Switchboard corpus, and comparisons are made to results obtained on single-speaker utterances. We conclude that, although the approach is promising, our technique for computing inter-conversation similarities prior to clustering needs improvement.
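The idea of giving each conversation two cluster memberships, rather than splitting it first, can be illustrated with a toy sketch. This record does not describe the paper's actual algorithm or its similarity measure, so everything below is an assumption made for illustration: the function name `assign_two_clusters`, the score matrix, and the rule of picking the two highest-scoring clusters are all hypothetical.

```python
from typing import List, Tuple


def assign_two_clusters(scores: List[List[float]]) -> List[Tuple[int, int]]:
    """Return, for each conversation, the indices of its two best clusters.

    scores[i][k] is a hypothetical similarity between conversation i and
    cluster k (higher means more similar). This is NOT the paper's method,
    only an illustration of two-cluster membership.
    """
    assignments = []
    for row in scores:
        if len(row) < 2:
            raise ValueError("need at least two clusters per conversation")
        # Rank cluster indices by descending similarity and keep the top two.
        ranked = sorted(range(len(row)), key=lambda k: row[k], reverse=True)
        assignments.append((ranked[0], ranked[1]))
    return assignments


# Made-up similarities for three conversations and three speaker clusters.
scores = [
    [0.90, 0.80, 0.10],
    [0.20, 0.70, 0.60],
    [0.85, 0.05, 0.75],
]
pairs = assign_two_clusters(scores)
print(pairs)
```

With these made-up scores, each conversation ends up linked to two distinct clusters, which stands in for its two speakers without ever splitting the audio.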
Pages: 817-820
Page count: 4
Related papers
50 in total
  • [31] Multi-speaker voice cryptographic key generation
    Paola Garcia-Perera, L.
    Carlos Mex-Perera, J.
    Nolazco-Flores, Juan A.
    [J]. 3RD ACS/IEEE INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS, 2005, 2005,
  • [32] MultiSpeech: Multi-Speaker Text to Speech with Transformer
    Chen, Mingjian
    Tan, Xu
    Ren, Yi
    Xu, Jin
    Sun, Hao
    Zhao, Sheng
    Qin, Tao
    [J]. INTERSPEECH 2020, 2020, : 4024 - 4028
  • [33] Evolutive HMM for multi-speaker tracking system
    Meignier, S
    Bonastre, JF
    Fredouille, C
    Merlin, T
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1201 - 1204
  • [34] Multi-speaker Recognition in Cocktail Party Problem
    Wang, Yiqian
    Sun, Wensheng
    [J]. COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, 2019, 463 : 2116 - 2123
  • [35] Multi-speaker Beamforming for Voice Activity Classification
    Tran, Thuy N.
    Cowley, William
    Pollok, Andre
    [J]. 2013 AUSTRALIAN COMMUNICATIONS THEORY WORKSHOP (AUSCTW), 2013, : 116 - 121
  • [36] AN INVESTIGATION OF MULTI-SPEAKER TRAINING FOR WAVENET VOCODER
    Hayashi, Tomoki
    Tamamori, Akira
    Kobayashi, Kazuhiro
    Takeda, Kazuya
    Toda, Tomoki
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 712 - 718
  • [37] Multi-speaker experimental designs: Methodological considerations
    Offrede, Tom
    Fuchs, Susanne
    Mooshammer, Christine
    [J]. LANGUAGE AND LINGUISTICS COMPASS, 2021, 15 (12):
  • [38] ForumSum: A Multi-Speaker Conversation Summarization Dataset
    Khalman, Misha
    Zhao, Yao
    Saleh, Mohammad
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 4592 - 4599
  • [39] SPEAKER CONDITIONING OF ACOUSTIC MODELS USING AFFINE TRANSFORMATION FOR MULTI-SPEAKER SPEECH RECOGNITION
    Yousefi, Midia
    Hansen, John H. L.
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 283 - 288
  • [40] Speaker Diarization in a Multi-Speaker Environment Using Particle Swarm Optimization and Mutual Information
    Mirrezaie, S. M.
    Ahadi, S. M.
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-4, 2008, : 1533 - 1536