Efficient feature extraction and classification for the development of Pashto speech recognition system

Cited by: 0
Authors
Irfan Ahmed
Muhammad Abeer Irfan
Abid Iqbal
Amaad Khalil
Salman Ilahi Siddiqui
Affiliations
[1] University of Engineering and Technology Peshawar, Department of Electrical Engineering
[2] Jalozai Campus, Department of Computer Systems Engineering
[3] University of Engineering and Technology Peshawar
Source
Keywords
Automatic speech recognition (ASR); Machine learning (ML); Feature extraction; MFCC; DWT; SVM; k-NN
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification number
Abstract
In this work, a novel framework for the efficient feature extraction and recognition of Pashto speech signals is proposed. Pashto is a low-resource language and is prone to higher Automatic Speech Recognition (ASR) error rates because of its colloquial dialects. We devised a framework that not only employs classical Machine Learning (ML) models for speech recognition tasks but also achieves higher accuracy by using optimal feature extraction techniques. The feature extraction frameworks are based on two well-known techniques: Discrete Wavelet Transform (DWT) coefficients and Mel-Frequency Cepstral Coefficients (MFCC). We deployed classical ML models, namely Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN), because of their lower computational complexity, energy efficiency, and higher accuracy compared with other ML and Deep Learning (DL) models. The proposed framework exhibited improved performance when trained on a dataset of isolated Pashto words.
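The abstract pairs wavelet-based features with a classical classifier. The sketch below illustrates that general idea only and is not the authors' implementation. Everything in it is an assumption: the Haar wavelet, the log sub-band energy features, the number of decomposition levels, the hand-written k-NN, and the synthetic sinusoids that stand in for isolated-word recordings. The helper names `haar_dwt_energies`, `knn_predict`, and `make_word` are hypothetical.

```python
import numpy as np

def haar_dwt_energies(signal, levels=4):
    """Log-energies of Haar DWT sub-bands: one detail band per level
    plus the final approximation band (assumed feature design)."""
    approx = np.asarray(signal, dtype=float)
    feats = []
    for _ in range(levels):
        if len(approx) % 2:                      # pad odd lengths with a zero
            approx = np.append(approx, 0.0)
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)     # high-pass half-band
        approx = (even + odd) / np.sqrt(2.0)     # low-pass half-band
        feats.append(np.log(np.sum(detail ** 2) + 1e-12))
    feats.append(np.log(np.sum(approx ** 2) + 1e-12))
    return np.array(feats)

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def make_word(freq, rng, n=1024, sr=8000):
    """Synthetic stand-in for one utterance: a noisy sinusoid (not real speech)."""
    t = np.arange(n) / sr
    return np.sin(2 * np.pi * freq * t) + 0.05 * rng.standard_normal(n)

rng = np.random.default_rng(0)
freqs = {0: 200.0, 1: 1500.0}                    # two hypothetical "word" classes
train_x = np.array([haar_dwt_energies(make_word(f, rng))
                    for f in freqs.values() for _ in range(10)])
train_y = np.array([lab for lab in freqs for _ in range(10)])

pred_low = knn_predict(train_x, train_y, haar_dwt_energies(make_word(200.0, rng)))
pred_high = knn_predict(train_x, train_y, haar_dwt_energies(make_word(1500.0, rng)))
```

In practice, a library such as PyWavelets or scikit-learn would likely replace the hand-written transform and classifier, and MFCC features could be substituted for or combined with the DWT features.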
Pages: 54081 - 54096
Page count: 15
Related papers
50 in total
  • [21] Denoising Speech for MFCC Feature Extraction Using Wavelet Transformation in Speech Recognition System
    Hidayat, Risanuri
    Bejo, Agus
    Sumaryono, Sujoko
    Winursito, Anggun
    PROCEEDINGS OF 2018 THE 10TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING (ICITEE), 2018, : 280 - 284
  • [22] Comparison of Feature Extraction for Accent Dependent Thai Speech Recognition System
    Tantisatirapong, Suchada
    Prasoproek, Chalisa
    Phothisonothai, Montri
    2018 IEEE SEVENTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS (IEEE ICCE 2018), 2018, : 322 - 325
  • [23] Proposed combination of PCA and MFCC feature extraction in speech recognition system
    Hoang Trang
    Tran Hoang Loc
    Huynh Bui Hoang Nam
    2014 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC), 2014, : 697 - 702
  • [24] PASHTO SPEECH RECOGNITION WITH LIMITED PRONUNCIATION LEXICON
    Prasad, Rohit
    Tsakalidis, Stavros
    Bulyko, Ivan
    Kao, Chia-lin
    Natarajan, Prem
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5086 - 5089
  • [25] Face Recognition by Feature Extraction and Classification
    Chen, Xinzheng
    Song, Lihong
    Qiu, Chaochao
    PROCEEDINGS OF 2018 12TH IEEE INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (ASID), 2018, : 43 - 46
  • [26] Composite Feature Extraction for Speech Emotion Recognition
    Fu, Yangzhi
    Yuan, Xiaochen
    2020 IEEE 23RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2020), 2020, : 72 - 77
  • [27] Reduced Feature Extraction for Emotional Speech Recognition
    Palo, Hemanta Kumar
    Mohanty, Mihir Narayan
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [28] Geometrical feature extraction for robust speech recognition
    Li, Xiaokun
    Kwan, Chiman
    2005 39TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1 AND 2, 2005, : 558 - 562
  • [29] Sparse KPCA for feature extraction in speech recognition
    Lima, A
    Zen, H
    Nankaku, Y
    Tokuda, K
    Kitamura, T
    Resende, FG
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 353 - 356
  • [30] The application of optimization in feature extraction of speech recognition
    Gu, L
    Liu, RS
    ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 745 - 748