Multi-Speaker Adaptation for Robust Speech Recognition under Ubiquitous Environment

被引:0
|
作者
Shih, Po-Yi [1 ]
Wang, Jhing-Fa [1 ]
Lin, Yuan-Ning [1 ]
Fu, Zhong-Hua [1 ]
机构
[1] Natl Cheng Kung Univ, Dept Elect Engn, Tainan 70101, Taiwan
关键词
SUPPORT VECTOR MACHINES; SPEAKER VERIFICATION; NEURAL-NETWORK; IDENTIFICATION; MODELS; SYSTEMS;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents a multi-speaker adaptation for robust speech recognition under ubiquitous environment. The goal is to adapt the speech recognition model for each speaker correctly in ubiquitous multi-speaker environment. We integrate speaker recognition and unsupervised speaker adaptation method to promote the speech recognition performances. Specifically we employ a confidence measure to reduce the possible negative adaptation caused by the environment noise or the recognition errors. The experimental results show that the proposed framework can efficiently promote the average recognition accuracy to 80-90% for multi-speaker ubiquitous speech recognition.
引用
收藏
页码:126 / 131
页数:6
相关论文
共 50 条
  • [1] Sparse Component Analysis for Speech Recognition in Multi-Speaker Environment
    Asaei, Afsaneh
    Bourlard, Herve
    Garner, Philip N.
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1704 - 1707
  • [2] Automatic Multi-Speaker Speech Recognition System Based on Time-Frequency Blind Source Separation under Ubiquitous Environment
    Wang, Zhe
    Zhang, Haijian
    Bi, Guoan
    Li, Xiumei
    [J]. PROCEEDINGS OF THE 2014 9TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2014, : 101 - +
  • [3] A hybrid approach to speaker recognition in multi-speaker environment
    Trivedi, J
    Maitra, A
    Mitra, SK
    [J]. PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 272 - 275
  • [4] END-TO-END MULTI-SPEAKER SPEECH RECOGNITION
    Settle, Shane
    Le Roux, Jonathan
    Hori, Takaaki
    Watanabe, Shinji
    Hershey, John R.
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4819 - 4823
  • [5] Speech Recognition and Multi-Speaker Diarization of Long Conversations
    Mao, Huanru Henry
    Li, Shuyang
    McAuley, Julian
    Cottrell, Garrison W.
    [J]. INTERSPEECH 2020, 2020, : 691 - 695
  • [6] End-to-End Multilingual Multi-Speaker Speech Recognition
    Seki, Hiroshi
    Hori, Takaaki
    Watanabe, Shinji
    Le Roux, Jonathan
    Hershey, John R.
    [J]. INTERSPEECH 2019, 2019, : 3755 - 3759
  • [7] END-TO-END MULTI-SPEAKER SPEECH RECOGNITION WITH TRANSFORMER
    Chang, Xuankai
    Zhang, Wangyou
    Qian, Yanmin
    Le Roux, Jonathan
    Watanabe, Shinji
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6134 - 6138
  • [8] Joint speaker and environment adaptation using Tensor Voice for robust speech recognition
    Jeong, Yongwon
    [J]. SPEECH COMMUNICATION, 2014, 58 : 1 - 10
  • [9] PHONEME DEPENDENT SPEAKER EMBEDDING AND MODEL FACTORIZATION FOR MULTI-SPEAKER SPEECH SYNTHESIS AND ADAPTATION
    Fu, Ruibo
    Tao, Jianhua
    Wen, Zhengqi
    Zheng, Yibin
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6930 - 6934
  • [10] SYNTHESIZING DYSARTHRIC SPEECH USING MULTI-SPEAKER TTS FOR DYSARTHRIC SPEECH RECOGNITION
    Soleymanpour, Mohammad
    Johnson, Michael T.
    Soleymanpour, Rahim
    Berry, Jeffrey
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7382 - 7386