Noise and speaker robustness in a Persian continuous speech recognition system

Authors
Veisi, Hadi [1]
Sameti, Hossein [1]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835
Abstract
This paper investigates VTLN speaker normalization and the MLLR and MAP adaptation methods in a Persian HMM-based, speaker-independent, large-vocabulary continuous speech recognition system. Robustness to speaker variation and environmental noise is achieved for this system in real-world applications. A search-based method is used in VTLN to find speaker-relative warping factors, which are applied to the signal's spectrum to normalize the effect of vocal tract length (VTL) variation between speakers. In the MLLR framework, Gaussian mean and covariance transformations are examined in both global and full adaptation, using regression-tree-based adaptation in a batch-supervised fashion. Standard MAP is also examined as an adaptation method. Combinations of these approaches with the CAN robust feature method are evaluated on four different tasks. Recognition performance in noisy environments improves significantly, enough to make the system usable in real applications.
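The search-based VTLN step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear frequency warp and a grid of candidate factors from 0.88 to 1.12. The function names, the grid, and the toy scoring function are hypothetical choices made for illustration and are not taken from the paper; in a recognizer the score would normally be an acoustic likelihood under the HMMs rather than a distance to a template.

```python
import numpy as np

def warp_spectrum(spectrum, alpha):
    """Resample a magnitude spectrum on a linearly scaled frequency axis.

    Output bin k takes the (linearly interpolated) value of input bin
    alpha * k, clipped to the last available bin. Assumption: a plain
    linear warp, not the piecewise-linear or bilinear variants.
    """
    n = len(spectrum)
    bins = np.arange(n, dtype=float)
    source = np.clip(alpha * bins, 0.0, n - 1)
    return np.interp(source, bins, spectrum)

def search_warp_factor(spectrum, score, alphas=None):
    """Grid search: return the warping factor with the highest score.

    `score` is a caller-supplied function mapping a warped spectrum to a
    number (hypothetical stand-in for an acoustic likelihood).
    """
    if alphas is None:
        alphas = np.linspace(0.88, 1.12, 13)  # assumed grid, step 0.02
    scores = [score(warp_spectrum(spectrum, a)) for a in alphas]
    best = int(np.argmax(scores))
    return float(alphas[best]), float(scores[best])

# Toy data: a reference spectrum with one peak, and a "speaker" whose
# spectrum is the same shape stretched along the frequency axis by 1.1.
bins = np.arange(257, dtype=float)
template = lambda x: np.exp(-0.5 * ((x - 60.0) / 5.0) ** 2)
speaker = template(bins / 1.1)

# Toy score: negative squared distance to the reference spectrum.
best_alpha, best_score = search_warp_factor(
    speaker, lambda s: -np.sum((s - template(bins)) ** 2)
)
```

On this toy data the search should pick the grid point that undoes the stretch applied to the speaker spectrum. A finer grid or a per-utterance likelihood score can be swapped in without changing the search itself.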
Pages: 73-76
Number of pages: 4