Deep Learning Bidirectional LSTM based Detection of Prolongation and Repetition in Stuttered Speech using Weighted MFCC

Cited by: 0
Authors
Gupta, Sakshi [1 ]
Shukla, Ravi S. [2 ]
Shukla, Rajesh K. [1 ]
Verma, Rajesh [3 ]
Affiliations
[1] Invertis Univ, Dept Comp Sci & Engn, Bareilly, Uttar Pradesh, India
[2] Saudi Elect Univ, Dept Comp Sci, Riyadh, Saudi Arabia
[3] King Khalid Univ, Dept Elect Engn, Abha, Saudi Arabia
Keywords
Speech; stuttering; deep learning; WMFCC; Bi-LSTM; CLASSIFICATION; DYSFLUENCIES; RECOGNITION;
DOI
10.14569/IJACSA.2020.0110941
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Stuttering is a neurodevelopmental disorder in which the normal flow of speech is disrupted. Traditionally, speech-language pathologists assessed the extent of stuttering by counting speech disfluencies manually. Such assessments are subjective, inconsistent, time-consuming, and error-prone. The present study focuses on the objective assessment of speech disfluencies such as prolongation and syllable, word, and phrase repetition. The proposed method combines the Weighted Mel Frequency Cepstral Coefficient (WMFCC) feature extraction algorithm with a deep-learning Bidirectional Long Short-Term Memory (Bi-LSTM) neural network for classifying stuttered events. The work uses the UCLASS stuttering dataset for analysis. The speech samples in the database are first preprocessed, manually segmented, and labeled by disfluency type. The labeled speech samples are then parameterized into WMFCC feature vectors, which are fed to the Bi-LSTM network to train and test the model. The effect of different hyperparameters on classification results is examined. The test results show that the proposed method reaches a best accuracy of 96.67%, compared with the LSTM model. Recognition accuracies of 97.33%, 98.67%, 97.5%, 97.19%, and 97.67% were achieved for the detection of fluent speech, prolongation, syllable repetition, word repetition, and phrase repetition, respectively.
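The abstract does not give the WMFCC weighting formula. As a rough illustration only, the sketch below assumes one common formulation from the literature, in which each static MFCC frame is combined with its first and second temporal derivatives as c + p·Δc + q·ΔΔc. The weights `p` and `q`, the regression window `width`, and the function names `delta` and `weighted_mfcc` are hypothetical choices for this example, not values or code taken from the paper.

```python
from typing import List

Frame = List[float]


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based temporal derivative of a (frames x coefficients) matrix.

    Uses d_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2), n = 1..width,
    repeating the first and last frame at the boundaries.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: List[Frame] = []
    for t in range(num_frames):
        row: Frame = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def weighted_mfcc(mfcc: List[Frame], p: float = 1 / 3, q: float = 1 / 6) -> List[Frame]:
    """Combine static, delta, and delta-delta coefficients into one vector per frame.

    ASSUMPTION: the weights p and q are illustrative placeholders only.
    """
    d1 = delta(mfcc)   # first derivative (delta)
    d2 = delta(d1)     # second derivative (delta-delta)
    return [
        [c + p * a + q * b for c, a, b in zip(frame, fa, fb)]
        for frame, fa, fb in zip(mfcc, d1, d2)
    ]
```

One property of this formulation is that the weighted vector has the same dimensionality as the static MFCC frame, unlike the usual approach of concatenating static, delta, and delta-delta coefficients. In a pipeline like the one described, the `mfcc` argument would come from an MFCC extractor applied to each labeled segment, and the resulting frame sequence would be the input to the sequence classifier.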
Pages: 345-356
Page count: 12
Related papers
50 in total
  • [1] Recognition of Repetition and Prolongation in Stuttered Speech Using ANN
    Savin, P. S.
    Ramteke, Pravin B.
    Koolagudi, Shashidhar G.
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND INFORMATICS (ICACNI 2015), VOL 1, 2016, 43 : 65 - 71
  • [2] A bidirectional LSTM deep learning approach for intrusion detection
    Imrana, Yakubu
    Xiang, Yanping
    Ali, Liaqat
    Abdul-Rauf, Zaharawu
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 185 (185)
  • [3] MFCC based Recognition of Repetitions and Prolongations in Stuttered Speech using k-NN and LDA
    Chee, Lim Sin
    Ai, Ooi Chia
    Hariharan, M.
    Yaacob, Sazali
    2009 IEEE STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT: SCORED 2009, PROCEEDINGS, 2009, : 146 - 149
  • [4] FluentNet: End-to-End Detection of Stuttered Speech Disfluencies With Deep Learning
    Kourkounakis, Tedd
    Hajavi, Amirhossein
    Etemad, Ali
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2986 - 2999
  • [5] DYSFLUENCY CLASSIFICATION IN STUTTERED SPEECH USING DEEP LEARNING FOR REAL-TIME APPLICATIONS
    Jouaiti, Melanie
    Dautenhahn, Kerstin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6482 - 6486
  • [6] MFCC-based Houston Toad Call Detection using LSTM
    Al Bashit, Abdullah
    Valles, Damian
    2019 IEEE INTERNATIONAL SYMPOSIUM ON MEASUREMENT AND CONTROL IN ROBOTICS (ISMCR): ROBOTICS FOR THE BENEFIT OF HUMANITY, 2019,
  • [7] Deep Learning LSTM based Ransomware Detection
    Maniath, Sumith
    Ashok, Aravind
    Poornachandran, Prabaharan
    Sujadevi, V. G.
    Sankar, Prem A. U.
    Jan, Srinath
    2017 RECENT DEVELOPMENTS IN CONTROL, AUTOMATION AND POWER ENGINEERING (RDCAPE), 2017, : 442 - 446
  • [8] Development of vanilla LSTM based stuttered speech recognition system using bald eagle search algorithm
    S. Premalatha
    Vinit Kumar
    Naga Padmaja Jagini
    Gade Venkata Subba Reddy
    Signal, Image and Video Processing, 2023, 17 : 4077 - 4086
  • [9] Development of vanilla LSTM based stuttered speech recognition system using bald eagle search algorithm
    Premalatha, S.
    Kumar, Vinit
    Jagini, Naga Padmaja
    Reddy, Gade Venkata Subba
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (08) : 4077 - 4086
  • [10] An Investigation of Word Embeddings with Deep Bidirectional LSTM for Sentence Unit Detection in Automatic Speech Transcription
    Ho, Thi-Nga
    Duy-Cat Can
    Chng, Eng-Siong
    2018 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2018, : 139 - 142