Multi-speaker Recognition in Cocktail Party Problem

Cited by: 0

Authors:

Wang, Yiqian [1]

Sun, Wensheng [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, 10 Xitucheng Rd, Beijing, Peoples R China

Source:

COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS | 2019 / Vol. 463

Keywords:

Multi-speaker recognition; Cocktail party; Feature extraction; Statistical decision theory;

DOI:

10.1007/978-981-10-6571-2_258

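Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification codes:

0808 ; 0809 ;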

Abstract:

This paper proposes an original statistical decision theory for multi-speaker recognition in the cocktail party problem. The theory assumes that the varying frequencies of speakers follow a Gaussian distribution and that the relationship between their voiceprints can be represented by Euclidean distance vectors. Mel-Frequency Cepstral Coefficients (MFCCs) are used to extract voice features, both to judge whether a speaker is present in a multi-speaker environment and to identify who that speaker is. Finally, a thirteen-dimensional constellation diagram is built by mapping the Manhattan distances between speakers, in order to account thoroughly for the overall influencing factors.
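
The distance-based decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 13-dimensional toy vectors merely stand in for MFCC features, the function names are hypothetical, and the acceptance rule (mean plus k standard deviations of enrollment distances) is only one assumed reading of the Gaussian assumption.

```python
import math
from statistics import mean, stdev


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_threshold(distances, k=2.0):
    """Assumed rule: treat same-speaker distances as Gaussian and
    accept anything within mean + k * standard deviation."""
    return mean(distances) + k * stdev(distances)


def identify(test_vec, enrolled, threshold, dist=manhattan):
    """Return the closest enrolled speaker, or None if even the
    closest one lies beyond the acceptance threshold."""
    best_name, best_d = None, float("inf")
    for name, ref in enrolled.items():
        d = dist(test_vec, ref)
        if d < best_d:
            best_name, best_d = name, d
    return best_name if best_d <= threshold else None


# Toy 13-dimensional "voiceprints" (placeholders, not real MFCCs).
enrolled = {"alice": [0.0] * 13, "bob": [1.0] * 13}

# Hypothetical same-speaker distances observed during enrollment.
threshold = fit_threshold([1.0, 1.2, 1.4])

print(identify([0.1] * 13, enrolled, threshold))  # close to "alice"
print(identify([5.0] * 13, enrolled, threshold))  # far from everyone
```

In this sketch the second query is rejected because its distance to every enrolled speaker exceeds the threshold, which corresponds to deciding that the speaker is not among the enrolled ones. Passing `dist=euclidean` switches the metric without changing the decision logic.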


Pages: 2116-2123

Number of pages: 8

50 items in total

[1] Target Speaker Recognition in The Cocktail Party
Jung, Dae-Jin
Cho, Sunyoung
Chun, Tae Yoon
Yoon, Soosung
[J]. 2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 1954 - 1958
[2] A hybrid approach to speaker recognition in multi-speaker environment
Trivedi, J
Maitra, A
Mitra, SK
[J]. PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 272 - 275
[3] Fast ICA for Multi-speaker Recognition System
Zhou, Yan
Zhao, Zhiqiang
[J]. ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS, 2010, 93 : 507 - 513
[4] MULTI-SPEAKER CONVERSATIONS, CROSS-TALK, AND DIARIZATION FOR SPEAKER RECOGNITION
Sell, Gregory
McCree, Alan
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5425 - 5429
[5] SPEAKER RECOGNITION FOR MULTI-SPEAKER CONVERSATIONS USING X-VECTORS
Snyder, David
Garcia-Romero, Daniel
Sell, Gregory
McCree, Alan
Povey, Daniel
Khudanpur, Sanjeev
[J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5796 - 5800
[6] END-TO-END MULTI-SPEAKER SPEECH RECOGNITION
Settle, Shane
Le Roux, Jonathan
Hori, Takaaki
Watanabe, Shinji
Hershey, John R.
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4819 - 4823
[7] Research on ASIC for multi-speaker isolated word recognition
Xiong, B
Sun, YH
[J]. 1996 2ND INTERNATIONAL CONFERENCE ON ASIC, PROCEEDINGS, 1996, : 135 - 137
[8] Speech Recognition and Multi-Speaker Diarization of Long Conversations
Mao, Huanru Henry
Li, Shuyang
McAuley, Julian
Cottrell, Garrison W.
[J]. INTERSPEECH 2020, 2020, : 691 - 695
[9] Integration of audio-visual information for multi-speaker multimedia speaker recognition
Yang, Jichen
Chen, Fangfan
Cheng, Yu
Lin, Pei
[J]. DIGITAL SIGNAL PROCESSING, 2024, 145
[10] End-to-End Multilingual Multi-Speaker Speech Recognition
Seki, Hiroshi
Hori, Takaaki
Watanabe, Shinji
Le Roux, Jonathan
Hershey, John R.
[J]. INTERSPEECH 2019, 2019, : 3755 - 3759
