Amharic spoken digits recognition using convolutional neural network

Authors
Ayall, Tewodros Alemu [1, 4]
Zhou, Changjun [1]
Liu, Huawen [2]
Brhanemeskel, Getnet Mezgebu [3]
Abate, Solomon Teferra [3]
Adjeisah, Michael [1]
Affiliations
[1] Zhejiang Normal Univ, Sch Comp Sci & Technol, Jinhua, Peoples R China
[2] Shaoxing Univ, Dept Comp Sci, Shaoxing, Peoples R China
[3] Addis Ababa Univ, Sch Informat Sci, Addis Ababa, Ethiopia
[4] Univ Aberdeen, Interdisciplinary Ctr Data & AI, Sch Nat & Comp Sci, Aberdeen AB24 3UE, Scotland
Keywords
Automatic speech recognition; Spoken digit recognition; Amharic spoken digits recognition; Convolutional neural network; Speech feature extraction; Speech recognition
DOI
10.1186/s40537-024-00910-z
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
Spoken digits recognition (SDR) is a supervised form of automatic speech recognition that is required in various human-machine interaction applications. It is used in phone-based services such as dialing systems, certain bank operations, airline reservation systems, and price extraction. However, designing an SDR system is challenging: it requires building a labeled audio dataset, choosing a suitable feature extraction method, and developing the best-performing model. Although several studies exist for languages such as English, Arabic, and Urdu, no Amharic spoken digits dataset (AmSDD) has been available for building an Amharic spoken digits recognition (AmSDR) model for Amharic, the official working language of the government of Ethiopia. Therefore, in this study, we developed a new AmSDD containing 12,000 utterances of the digits 0 (Zaero) to 9 (zet'enyi), recorded from 120 volunteer speakers of different age groups, genders, and dialects, each of whom repeated every digit ten times. Mel frequency cepstral coefficients (MFCCs) and Mel-Spectrogram feature extraction methods were used to extract trainable features from the speech signal. We conducted experiments on developing the AmSDR model using the AmSDD, with the classical supervised learning algorithms Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF) as baselines. To further improve the recognition performance of AmSDR, we propose a three-layer Convolutional Neural Network (CNN) architecture with batch normalization. The results of our experiments show that the proposed CNN model outperforms the baseline algorithms, reaching an accuracy of 99% with MFCC features and 98% with Mel-Spectrogram features.
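The two feature types named in the abstract share a common front end: a mel filterbank applied to a frame's power spectrum gives log-mel energies (one column of a Mel-Spectrogram), and a discrete cosine transform of those energies gives MFCCs. The following is a minimal standard-library sketch of that relationship for a single frame. It is not the authors' pipeline: the sample rate, frame length, number of filters, number of coefficients, the synthetic test tone, and all function names are assumptions chosen for illustration, and the mel formula is one common (HTK-style) convention.

```python
import math

def hz_to_mel(f_hz):
    # HTK-style mel scale; other conventions exist.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters over the n_fft // 2 + 1 non-negative FFT bins."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, equally spaced on the mel scale.
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank

def power_spectrum(frame):
    """Naive DFT power spectrum; fine for a short illustrative frame."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def log_mel_energies(frame, bank):
    """One Mel-Spectrogram column: log of filterbank-weighted power."""
    spec = power_spectrum(frame)
    # Small floor avoids log(0) for empty bands.
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]

def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

# Assumed illustrative settings, not taken from the paper.
SAMPLE_RATE = 8000
N_FFT = 256
N_FILTERS = 20
N_MFCC = 13

# Synthetic Hamming-windowed 440 Hz tone standing in for a speech frame.
frame = [
    (0.54 - 0.46 * math.cos(2 * math.pi * t / (N_FFT - 1)))
    * math.sin(2 * math.pi * 440.0 * t / SAMPLE_RATE)
    for t in range(N_FFT)
]

bank = mel_filterbank(N_FILTERS, N_FFT, SAMPLE_RATE)
log_mel = log_mel_energies(frame, bank)   # Mel-Spectrogram column
mfcc = dct2(log_mel, N_MFCC)              # MFCC vector for the same frame
```

Stacking `log_mel` or `mfcc` vectors over successive frames yields the two-dimensional inputs that a CNN can consume; in practice an FFT-based library would replace the naive DFT used here.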
Pages: 23