Deep Learning Bidirectional LSTM based Detection of Prolongation and Repetition in Stuttered Speech using Weighted MFCC

Authors
Gupta, Sakshi [1]
Shukla, Ravi S. [2]
Shukla, Rajesh K. [1]
Verma, Rajesh [3]
Affiliations
[1] Invertis Univ, Dept Comp Sci & Engn, Bareilly, Uttar Pradesh, India
[2] Saudi Elect Univ, Dept Comp Sci, Riyadh, Saudi Arabia
[3] King Khalid Univ, Dept Elect Engn, Abha, Saudi Arabia
Keywords
Speech; stuttering; deep learning; WMFCC; Bi-LSTM; CLASSIFICATION; DYSFLUENCIES; RECOGNITION;
DOI
10.14569/IJACSA.2020.0110941
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Stuttering is a neurodevelopmental disorder in which the normal flow of speech is disrupted. Traditionally, speech-language pathologists assessed the extent of stuttering by counting speech disfluencies manually. Such assessments are arbitrary, inconsistent, lengthy, and error-prone. The present study focuses on the objective assessment of speech disfluencies, namely prolongation and syllable, word, and phrase repetition. The proposed method combines the Weighted Mel Frequency Cepstral Coefficient (WMFCC) feature extraction algorithm with a deep-learning Bidirectional Long Short-Term Memory (Bi-LSTM) neural network for classifying stuttered events. The work uses the UCLASS stuttering dataset. The speech samples are first preprocessed, manually segmented, and labeled by disfluency type. The labeled samples are then parameterized as WMFCC feature vectors, which are fed to the Bi-LSTM network for training and testing. The effect of different hyperparameters on the classification results is examined. The test results show that the proposed method reaches a best accuracy of 96.67% in a comparison with the LSTM model. Recognition accuracies of 97.33%, 98.67%, 97.5%, 97.19%, and 97.67% were achieved for the detection of fluent speech, prolongation, syllable repetition, word repetition, and phrase repetition, respectively.
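The abstract does not define how the MFCC features are weighted. As a minimal sketch only, the code below assumes one formulation sometimes used for weighted MFCC: each static coefficient plus down-weighted first and second temporal derivatives, c + p*delta(c) + q*delta(delta(c)). The weights `p` and `q`, the regression width, and the function names `delta` and `weighted_mfcc` are illustrative assumptions, not values or identifiers from the paper. The input is assumed to be an already computed MFCC matrix (frames by coefficients).

```python
from typing import List

Matrix = List[List[float]]  # rows are frames, columns are cepstral coefficients


def delta(feats: Matrix, width: int = 2) -> Matrix:
    """Regression-based temporal derivative; edge frames are repeated as padding."""
    n = len(feats)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out: Matrix = []
    for t in range(n):
        row = []
        for j in range(len(feats[t])):
            acc = 0.0
            for k in range(1, width + 1):
                nxt = feats[min(t + k, n - 1)][j]  # clamp at the last frame
                prv = feats[max(t - k, 0)][j]      # clamp at the first frame
                acc += k * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def weighted_mfcc(mfcc: Matrix, p: float = 0.5, q: float = 0.25) -> Matrix:
    """Assumed weighting: static + p * delta + q * delta-delta (p, q are hypothetical)."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return [
        [c + p * a + q * b for c, a, b in zip(row_c, row_a, row_b)]
        for row_c, row_a, row_b in zip(mfcc, d1, d2)
    ]


if __name__ == "__main__":
    # Toy 5-frame, 1-coefficient "MFCC" ramp, used only to show the call shape.
    ramp = [[float(t)] for t in range(5)]
    print(weighted_mfcc(ramp))
```

Under this assumed weighting, the fused vector has the same dimensionality as the static MFCC vector, so a sequence classifier such as a Bi-LSTM would receive one vector per frame without a larger input layer.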
Pages: 345-356
Number of pages: 12