JOINT MODELING OF ACCENTS AND ACOUSTICS FOR MULTI-ACCENT SPEECH RECOGNITION

Cited by: 0
Authors
Yang, Xuesong [1]
Audhkhasi, Kartik [2]
Rosenberg, Andrew [2]
Thomas, Samuel [2]
Ramabhadran, Bhuvana [2]
Hasegawa-Johnson, Mark [1]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
Keywords
End-to-end models; acoustic modeling; multi-accent speech recognition; multi-task learning; ENGLISH
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The performance of automatic speech recognition systems degrades with increasing mismatch between the training and testing scenarios. Differences in speaker accents are a significant source of such mismatch. The traditional approach to handling multiple accents involves pooling data from several accents during training and building a single model in multi-task fashion, where tasks correspond to individual accents. In this paper, we explore an alternative model in which we jointly learn an accent classifier and a multi-task acoustic model. Experiments on the American English Wall Street Journal and British English Cambridge corpora demonstrate that our joint model outperforms the strong multi-task acoustic model baseline. We obtain a 5.94% relative improvement in word error rate on British English and a 9.47% relative improvement on American English. This illustrates that joint modeling with accent information improves acoustic model performance.
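The abstract reports its gains as relative improvements in word error rate (WER). As a reminder of how such a figure is usually derived, here is a minimal Python sketch. The function name `relative_wer_improvement` and the WER values are hypothetical and chosen only to illustrate the arithmetic; this record does not list the paper's absolute WERs, so none are reproduced here.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction, in percent, of new_wer over baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs (in percent), used only to show the calculation;
# they are NOT results from the paper.
baseline = 10.0
joint = 9.0
print(f"{relative_wer_improvement(baseline, joint):.2f}% relative improvement")
```

A relative figure normalizes the absolute gain by the baseline error, so the same absolute drop counts for more when the baseline WER is already low.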
Pages: 5989-5993
Number of pages: 5