Spoken Arabic Digits Recognition Using Deep Learning

Cited by: 0

Authors:

Wazir, Abdulaziz Saleh Mahfoudh B. A. [1]

Chuah, Joon Huang [1]

Affiliations:

[1] Univ Malaya, VIP Res Lab, Dept Elect Engn, Fac Engn, Kuala Lumpur 50603, Malaysia

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC CONTROL AND INTELLIGENT SYSTEMS (I2CACIS) | 2019

Keywords:

Arabic digits; speech recognition; Deep Learning; Recurrent Neural Network (RNN); Long Short-Term Memory (LSTM)

DOI:

10.1109/i2cacis.2019.8825004

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Speech recognition has advanced tremendously over the past 50 years. Deep Neural Networks (DNNs) are among the most popular methods for speech analysis thanks to their ability to minimize the error rate in optimization problems. This research proposes an Arabic digit speech recognition model based on a Recurrent Neural Network (RNN). The model represents each speech signal by its Mel-Frequency Cepstral Coefficients (MFCCs), which are extracted after the signal has been processed for noise reduction and digit separation. The features extracted from each spoken digit are fed into a network with Long Short-Term Memory (LSTM) cells. LSTM cells can handle temporal dependencies that require long-term learning, and they address the vanishing gradient problem associated with RNNs. A dataset of 1040 samples of spoken Arabic digits from different dialects is used in this study: 840 samples are used to train the network and the remaining 200 samples are used for testing. Model training is carried out on a computing system with a Graphics Processing Unit (GPU). The learning parameters of the LSTM model are tuned, achieving an accuracy of 94% during model training. Test results for the tuned model show that it achieves 69% accuracy when recognizing spoken Arabic digits. The model reaches its highest accuracy, 80%, when recognizing the digit zero.
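
As a rough illustration of the step in which per-frame features are fed through LSTM cells, the following pure-Python sketch runs the standard LSTM gate equations over a sequence of feature frames. It is not the authors' implementation: the 13 coefficients per frame, the hidden size of 8, the random weights, and the random stand-in frames are all assumptions chosen only to make the example runnable, and the names `lstm_step`, `init_params`, and `encode` are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, params):
    """Apply one LSTM time step to a single feature frame x."""
    z = x + h  # concatenate the current frame with the previous hidden state

    def affine(name):
        weights, biases = params[name]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(weights, biases)]

    i = [sigmoid(v) for v in affine("input")]        # input gate
    f = [sigmoid(v) for v in affine("forget")]       # forget gate
    o = [sigmoid(v) for v in affine("output")]       # output gate
    g = [math.tanh(v) for v in affine("candidate")]  # candidate cell values

    # The cell state is updated additively, which is what lets information
    # persist across many time steps.
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def init_params(n_in, n_hidden, rng):
    """Small random weights and zero biases (illustrative, untrained)."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
                for _ in range(n_hidden)]
    return {name: (matrix(), [0.0] * n_hidden)
            for name in ("input", "forget", "output", "candidate")}


def encode(frames, params, n_hidden):
    """Run the LSTM over all frames and return the final hidden state."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h


rng = random.Random(0)
N_MFCC, N_HIDDEN = 13, 8  # assumed sizes, not taken from the paper
params = init_params(N_MFCC, N_HIDDEN, rng)
# Random numbers stand in for the MFCC frames of one spoken digit.
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_MFCC)] for _ in range(20)]
final_state = encode(frames, params, N_HIDDEN)
```

In a complete recognizer, the final hidden state would typically be passed to a classification layer over the ten digit classes, and all weights would be learned from the training samples rather than drawn at random.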

Pages: 339-344

Page count: 6
