END-TO-END MULTI-ACCENT SPEECH RECOGNITION WITH UNSUPERVISED ACCENT MODELLING

Times cited: 6
Authors
Li, Song [1]
Ouyang, Beibei [1]
Liao, Dexin [2]
Xia, Shipeng [2]
Li, Lin [1]
Hong, Qingyang [2]
Affiliations
[1] Xiamen Univ, Sch Elect Sci & Technol, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
End-to-end; speech recognition; multi-accent; global embedding
DOI
10.1109/ICASSP39728.2021.9414833
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
End-to-end speech recognition has achieved good recognition performance on standard English pronunciation datasets. However, one prominent problem with end-to-end speech recognition systems is that non-native English speakers tend to have complex and varied accents, which reduces the accuracy of English speech recognition in different countries. In order to grapple with such an issue, we first investigate and improve the current mainstream end-to-end multi-accent speech recognition technologies. In addition, we propose two unsupervised accent modelling methods, which convert accent information into a global embedding, and use it to improve the performance of the end-to-end multi-accent speech recognition systems. Experimental results on accented English datasets of eight countries (AESRC2020) show that, compared with the Transformer baseline, our proposed methods achieve relative 14.8% and 15.4% average word error rate (WER) reduction in the development set and evaluation set, respectively.
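The abstract reports its gains as relative reductions in word error rate (WER). As a reading aid, here is a minimal Python sketch of how WER and a relative reduction are conventionally computed. It is not the authors' evaluation code: the function names are hypothetical, the scoring assumes plain whitespace tokenization with word-level edit distance, and the WER figures in the usage comment are invented purely to illustrate the arithmetic.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and no text
    normalization is applied; real scoring tools may differ.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER removed by the improved system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


# Hypothetical figures, not taken from the paper: a baseline WER of 10.0%
# lowered to 8.52% is a 14.8% relative reduction (1.48 points absolute).
example = relative_reduction(10.0, 8.52)
```

A relative reduction is always measured against the baseline's own error rate, so the same relative figure corresponds to different absolute gains on the development and evaluation sets whenever their baseline WERs differ.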
Pages: 6418-6422
Page count: 5