Multi-speaker Recognition in Cocktail Party Problem

Authors
Wang, Yiqian [1]
Sun, Wensheng [1]
Affiliation
[1] Beijing Univ Posts & Telecommun, 10 Xitucheng Rd, Beijing, Peoples R China
Keywords
Multi-speaker recognition; Cocktail party; Feature extraction; Statistical decision theory
DOI
10.1007/978-981-10-6571-2_258
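Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809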
Abstract
This paper proposes an original statistical decision theory for multi-speaker recognition in the cocktail party problem. The theory assumes that the varying frequencies of speakers obey a Gaussian distribution and that the relationship between their voiceprints can be represented by Euclidean distance vectors. Mel-Frequency Cepstral Coefficients (MFCCs) are used to extract voice features in order to judge whether a speaker is present in a multi-speaker environment and to determine who that speaker is. Finally, a thirteen-dimensional constellation diagram is built by mapping from the Manhattan distances between speakers, so that the overall influencing factors are taken into account.
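The distance-based comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's decision rule: the enrolled vectors are made-up placeholders standing in for 13-dimensional mean MFCC vectors, the names `ENROLLED`, `euclidean`, `manhattan`, and `identify` are hypothetical, and the fixed `threshold` is an assumed stand-in for whatever statistical test the authors derive from the Gaussian assumption.

```python
import math

# Hypothetical 13-dimensional mean MFCC vectors for enrolled speakers.
# These values are illustrative placeholders, not data from the paper.
ENROLLED = {
    "speaker_a": [float(i) for i in range(13)],
    "speaker_b": [float(2 * i) for i in range(13)],
}


def euclidean(u, v):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def identify(query, enrolled, threshold):
    """Return (name, distance) for the nearest enrolled speaker.

    If even the nearest speaker is farther than `threshold`, the query is
    treated as "not an enrolled speaker" and the name is None.
    """
    name, dist = min(
        ((n, euclidean(query, ref)) for n, ref in enrolled.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= threshold else (None, dist)


if __name__ == "__main__":
    # A query vector close to speaker_a's placeholder features.
    query = [i + 0.1 for i in range(13)]
    print(identify(query, ENROLLED, threshold=1.0))
```

In practice the feature vectors would come from an MFCC front end applied to recorded speech, and the threshold would be chosen from the assumed distribution of distances rather than fixed by hand.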
Pages: 2116-2123
Page count: 8