Speech Recognition System Based on OLLO French Corpus by Using MFCCs

Cited by: 2

Authors:

Youcef, Braham Chaouche [1]

Elemine, Yessaad Mohamed [1]

Islam, Benmaiza [2]

Farid, Bouttout [3]

Affiliations:

[1] Univ Mohamed El Bachir El Ibrahimi, Dept Elect, LMSE Lab, Bordj Bou Arreridj, Algeria

[2] USTHB, Fac Elect & Comp Sci, Lab Spoken Commun & Signal Proc, Algiers 16000, Algeria

[3] Univ Constantine, Lab Signal Proc, Dept Elect, Constantine 25000, Algeria

Source:

RECENT ADVANCES IN ELECTRICAL ENGINEERING AND CONTROL APPLICATIONS | 2017 / Vol. 411

Keywords:

ASR system; HMM; MFCC; GMM; OLLO; HTK;

DOI:

10.1007/978-3-319-48929-2_25

Chinese Library Classification (CLC) number:

X [Environmental science, safety science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Automatic speech recognition (ASR) has been an area of active study since the early 1950s, and advances in stochastic processes, together with the introduction of Hidden Markov Models (HMMs), have given the field a new direction. This paper describes an approach to speech recognition based on Mel-Scale Frequency Cepstral Coefficients (MFCCs), drawing on recognition experiments carried out on the OLLO French corpus with different feature sets. Our work consists of finding the most appropriate configuration of MFCCs, extracted from the speech signal, for this task. To evaluate the analysis, we built a reference ASR system with the HTK tool in which phonemes are modeled by HMMs associated with Gaussian Mixture Models (GMMs). The system was configured through several experiments designed to choose the best parameters for the two main steps of building an ASR system: acoustic analysis and decoding. The experiments show that 25 Gaussian components provide a good compromise between recognition accuracy and computation time, and that the best-performing features are MFCC_E_D_A coefficients, with a recognition accuracy of 92.5%. The paper also reports and analyses the quality and testing of a speaker recognition and gender recognition system.
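The abstract names MFCC_E_D_A as the best-performing feature kind. In HTK's naming, the qualifiers _E, _D and _A stand for log energy, delta (first-order) and acceleration (second-order) coefficients. The sketch below is only an illustration of how delta and acceleration streams are commonly appended to static features with the standard regression formula; it is not the authors' code. The 13-dimensional static frame (12 cepstra plus energy), the window of 2 frames, and the function names are assumptions, not values taken from the paper.

```python
from typing import List

Frame = List[float]


def deltas(frames: List[Frame], window: int = 2) -> List[Frame]:
    """Regression-based delta coefficients over +/- `window` frames.

    d_t = sum_{k=1..window} k * (c_{t+k} - c_{t-k}) / (2 * sum_{k} k^2)
    Frames past either end are replaced by the first or last frame.
    """
    n = len(frames)
    if n == 0:
        return []
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out: List[Frame] = []
    for t in range(n):
        row: Frame = []
        for i in range(len(frames[t])):
            num = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][i]   # clamp at the last frame
                behind = frames[max(t - k, 0)][i]      # clamp at the first frame
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def add_deltas_and_accels(static: List[Frame], window: int = 2) -> List[Frame]:
    """Concatenate static, delta, and acceleration coefficients per frame."""
    d = deltas(static, window)   # first-order differences (_D)
    a = deltas(d, window)        # deltas of the deltas (_A)
    return [s + dd + aa for s, dd, aa in zip(static, d, a)]


# Hypothetical input: 10 frames of 13 static coefficients (assumed size).
static = [[float(t)] * 13 for t in range(10)]
full = add_deltas_and_accels(static)
print(len(full), len(full[0]))
```

Under these assumptions each frame grows from 13 to 39 values, which is why the choice of feature kind affects both accuracy and computation time downstream.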

Pages: 326-331

Page count: 6
