Recognition and Classification of Pauses in Stuttered Speech using Acoustic Features

Cited by: 0
Authors
Afroz, Fathima [1 ]
Koolagudi, Shashidhar G. [2 ]
Affiliations
[1] JSS Acad Tech Educ Bangalore, Dept Informat Sci & Engn, JSSATE B Campus Dr Vishnuvardan Rd, Bengaluru 560060, Karnataka, India
[2] NITK, Dept Comp Sci & Engn, NH 66, Mangaluru 575025, Karnataka, India
Keywords
Acoustic features; Blind segmentation; Inter-morphic pauses; Intra-morphic pauses; Stuttered speech
DOI
10.1109/spin.2019.8711569
CLC classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification
0808; 0809
Abstract
Pauses play an essential role in speech activities. Normally they help the listener by creating time and space to decode and interpret the speaker's message. In stuttering, however, pauses disturb the normal flow of speech. The uncontrolled, frequent and unplanned occurrence of pauses leads to a slow speaking rate, results in broken words and increases the severity of stuttering. Pauses and stuttering therefore have a close relationship, and pauses are considered one of the important patterns in the diagnosis and treatment of stuttering. In this work, an attempt is made to identify inaudible (silent or unfilled) pauses in stuttered speech. Attributes such as the duration, frequency, position and distribution of pauses during speech tasks are measured and quantified. The UCLASS stuttered speech corpus is used for the analysis. An automatic blind segmentation approach segments the speech signal into voiced and unvoiced regions using dynamic thresholds based on energy and zero-crossing rate (ZCR). Fourth-formant frequencies are analysed to identify intra-morphic (unfilled) pauses present within voiced regions. The duration of intra-morphic pauses is analysed for stuttered and normal speech. It is observed that in normal speech, intra-morphic pauses range from 150 ms to 250 ms, inter-morphic pauses are <= 250 ms, and short pauses range from 50 ms to 150 ms, whereas in stuttered speech, short intra-morphic pauses range from 10 ms to 50 ms and long pauses range from 250 ms to 1-2 seconds. Segmentation of intra-morphic pauses is observed to achieve an accuracy of 98%. Results are compared and validated against manual segmentation.
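The voiced/unvoiced segmentation and pause-duration analysis described in the abstract can be sketched in Python. The frame sizes, the statistics-based threshold rule, and the voiced criterion (high energy, low ZCR) below are illustrative assumptions, not the authors' published implementation; only the pause-duration ranges are taken from the abstract.

```python
import numpy as np

def blind_segment(signal, sr, frame_ms=20, hop_ms=10):
    """Label each frame voiced/unvoiced from short-time energy and
    zero-crossing rate, with thresholds derived from the clip itself."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n = 1 + max(0, (len(signal) - frame) // hop)
    energy = np.empty(n)
    zcr = np.empty(n)
    for i in range(n):
        w = signal[i * hop : i * hop + frame]
        energy[i] = np.mean(w ** 2)                        # short-time energy
        zcr[i] = np.mean(np.abs(np.diff(np.sign(w))) > 0)  # zero-crossing rate
    # Dynamic thresholds relative to the clip's own statistics
    # (an assumed rule; the paper does not publish its exact thresholds).
    e_thr = 0.5 * (energy.mean() + energy.min())
    z_thr = zcr.mean()
    # Voiced frames: high energy and low ZCR.
    return (energy > e_thr) & (zcr < z_thr)

def classify_pause(duration_ms, stuttered):
    """Bucket an unfilled pause by the duration ranges reported
    in the abstract."""
    if stuttered:
        if 10 <= duration_ms <= 50:
            return "short intra-morphic (stuttered)"
        if duration_ms >= 250:
            return "long pause (stuttered)"
    else:
        if 50 <= duration_ms < 150:
            return "short pause"
        if 150 <= duration_ms <= 250:
            return "normal intra-morphic pause"
    return "outside reported ranges"
```

A 120 Hz tone followed by low-level noise, for example, yields voiced frames over the tone and unvoiced frames over the noise under these thresholds; the formant-based refinement step inside voiced regions is not modelled here.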
Pages: 921-926
Page count: 6