RELIABLE ACCENT SPECIFIC UNIT GENERATION WITH DYNAMIC GAUSSIAN MIXTURE SELECTION FOR MULTI-ACCENT SPEECH RECOGNITION

Times cited: 0
Authors
Zhang, Chao [1 ,2 ]
Liu, Yi [1 ]
Xia, Yunqing [1 ]
Zheng, Thomas Fang [1 ]
Olsen, Jesper [3 ]
Tian, JiLei [3 ]
Affiliations
[1] Tsinghua Natl Lab Informat Sci & Technol, Ctr Speech & Language Technol, Div Technol Innovat & Dev, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[3] Nokia Res Ctr, Beijing, Peoples R China
Keywords
Reliable Accent Specific Unit; Dynamic Gaussian Mixture Selection Scheme; Multiple Accents;
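DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract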
Mandarin speech often carries multiple accents, because most Chinese speakers have learned Mandarin as a second language. We propose generating reliable accent-specific units together with a dynamic Gaussian mixture selection scheme for multi-accent speech recognition. Time-aligned phoneme recognition is used to generate these units and to model accent variations explicitly and accurately. The dynamic Gaussian mixture selection scheme builds a dynamic observation density for each specified frame during decoding, which leads to efficient use of the Gaussian mixture components. This method increases coverage of the diverse accent variations found in multi-accent speech and alleviates the performance degradation caused by pruned beam search, without increasing the model size. The effectiveness of the approach is evaluated on three typical Chinese accents: Chuan, Yue, and Wu. Our approach significantly outperforms the traditional acoustic model reconstruction approach, reducing Syllable Error Rate (SER) by 6.30%, 4.93%, and 5.53%, respectively, without degrading performance on standard speech.
Pages: 6
Related papers
35 in total
  • [1] Reliable Accent-Specific Unit Generation With Discriminative Dynamic Gaussian Mixture Selection for Multi-Accent Chinese Speech Recognition
    Zhang, Chao
    Liu, Yi
    Xia, Yunqing
    Wang, Xuan
    Lee, Chin-Hui
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (10): 2073-2084
  • [2] DISCRIMINATIVE DYNAMIC GAUSSIAN MIXTURE SELECTION WITH ENHANCED ROBUSTNESS AND PERFORMANCE FOR MULTI-ACCENT SPEECH RECOGNITION
    Zhang, Chao
    Liu, Yi
    Xia, Yunqing
    Lee, Chin-Hui
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 4749-4752
  • [3] Multi-Accent Chinese Speech Recognition
    Liu, Yi
    Fung, Pascale
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 133-+
  • [4] A Multi-Accent Acoustic Model using Mixture of Experts for Speech Recognition
    Jain, Abhinav
    Singh, Vishwanath P.
    Rath, Shakti P.
    [J]. INTERSPEECH 2019, 2019: 779-783
  • [5] END-TO-END MULTI-ACCENT SPEECH RECOGNITION WITH UNSUPERVISED ACCENT MODELLING
    Li, Song
    Ouyang, Beibei
    Liao, Dexin
    Xia, Shipeng
    Li, Lin
    Hong, Qingyang
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6418-6422
  • [6] Multi-Accent and Accent-Independent Non-Native Speech Recognition
    Bouselmi, Ghazi
    Fohr, Dominique
    Illina, Irina
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 2703-+
  • [7] JOINT MODELING OF ACCENTS AND ACOUSTICS FOR MULTI-ACCENT SPEECH RECOGNITION
    Yang, Xuesong
    Audhkhasi, Kartik
    Rosenberg, Andrew
    Thomas, Samuel
    Ramabhadran, Bhuvana
    Hasegawa-Johnson, Mark
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5989-5993
  • [8] MULTI-ACCENT SPEECH RECOGNITION WITH HIERARCHICAL GRAPHEME BASED MODELS
    Rao, Kanishka
    Sak, Hasim
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 4815-4819
  • [9] Investigations of Low Resource Multi-Accent Mandarin Speech Recognition
    Wang, Wei
    Xu, Wenying
    Sui, Xiang
    Wang, Lan
    Liu, Xunying
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015: 62-66
  • [10] Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition
    Yang, Yanbing
    Shi, Hao
    Lin, Yuqin
    Ge, Meng
    Wang, Longbiao
    Hou, Qingzhi
    Dang, Jianwu
    [J]. 2022 13TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2022: 6-10