Speech Recognition System Based on OLLO French Corpus by Using MFCCs

Cited by: 2
Authors
Youcef, Braham Chaouche [1]
Elemine, Yessaad Mohamed [1]
Islam, Benmaiza [2]
Farid, Bouttout [3]
Affiliations
[1] Univ Mohamed El Bachir El Ibrahimi, Dept Elect, LMSE Lab, Bordj Bou Arreridj, Algeria
[2] USTHB, Fac Elect & Comp Sci, Lab Spoken Commun & Signal Proc, Algiers 16000, Algeria
[3] Univ Constantine, Lab Signal Proc, Dept Elect, Constantine 25000, Algeria
Keywords
ASR system; HMM; MFCC; GMM; OLLO; HTK
DOI
10.1007/978-3-319-48929-2_25
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Automatic speech recognition (ASR) has been an area of active study since the early 1950s, and advances in stochastic processes, in particular the introduction of Hidden Markov Models (HMMs), have given the field a new direction. This paper describes an approach to speech recognition based on Mel-Scale Frequency Cepstral Coefficients (MFCCs), drawing on recognition experiments carried out on the OLLO French corpus with different feature sets. Our work consists in finding the most appropriate MFCC configuration, extracted from the speech signal, for this task. To evaluate this analysis, we built a reference ASR system that models phonemes with HMMs combined with Gaussian Mixture Models (GMMs), using the HTK toolkit. The system was implemented through several experiments designed to select the best parameters for the two main steps of building an ASR system: acoustic analysis and decoding. The experiments show that 25 Gaussian components provide a good compromise between recognition accuracy and computation time, and that the best-performing features are the MFCC_E_D_A coefficients, which reach 92.5% recognition accuracy. The quality and testing of the speaker recognition and gender recognition systems are also completed and analysed.
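As an illustration of the MFCC_E_D_A feature kind named in the abstract: in HTK notation, _E appends log energy, _D appends delta (first-order regression) coefficients, and _A appends acceleration (second-order) coefficients. The sketch below is not the authors' code; it assumes 12 cepstra plus energy (13 static values per frame) and a regression window of 2 frames, neither of which is stated in the abstract. The function names are hypothetical.

```python
def deltas(frames, window=2):
    """Regression deltas in the HTK style:
    d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2), k = 1..window.
    The first and last frames are replicated at the edges."""
    if not frames:
        return []
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for j in range(len(frames[0])):
            acc = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][j]   # clamp at last frame
                behind = frames[max(t - k, 0)][j]      # clamp at first frame
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(static, window=2):
    """Append deltas (_D) and accelerations (_A) to each static frame.
    `static` is a list of frames; each frame is a list of floats
    (assumed here: 12 cepstra + log energy = 13 values)."""
    d = deltas(static, window)
    a = deltas(d, window)  # accelerations are deltas of the deltas
    return [s + dv + av for s, dv, av in zip(static, d, a)]


if __name__ == "__main__":
    # Toy input: 7 frames of 13 identical values forming a linear ramp.
    static = [[float(t)] * 13 for t in range(7)]
    feats = add_dynamic_features(static)
    print(len(feats), len(feats[0]))
```

Under these assumptions each frame grows from 13 to 39 values, which is the vector size the HMM/GMM acoustic models would then be trained on.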
Pages: 326-331
Number of pages: 6