Recognition and Classification of Pauses in Stuttered Speech using Acoustic Features

Cited by: 0
Authors
Afroz, Fathima [1]
Koolagudi, Shashidhar G. [2]
Affiliations
[1] JSS Acad Tech Educ Bangalore, Dept Informat Sci & Engn, JSSATE B Campus Dr Vishnuvardan Rd, Bengaluru 560060, Karnataka, India
[2] NITK, Dept Comp Sci & Engn, NH 66, Mangaluru 575025, Karnataka, India
Keywords
Acoustic features; Blind segmentation; Inter-morphic pauses; Intra-morphic pauses; Stuttered speech
DOI
10.1109/spin.2019.8711569
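Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract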
Pauses play an essential role in speech. Normally, they help the listener by creating time and space to decode and interpret the speaker's message. In stuttering, however, pauses disturb the normal flow of speech. The uncontrolled, frequent, and unplanned occurrence of pauses leads to a slow speaking rate, results in broken words, and increases the severity of stuttering. Hence, pauses and stuttering are closely related, and pauses are considered an important pattern in the diagnosis and treatment of stuttering. In this work, an attempt is made to identify inaudible (silent or unfilled) pauses in stuttered speech. Attributes such as the duration, frequency, position, and distribution of pauses during speech tasks are measured and quantified. The UCLASS stuttered speech corpus is used for the analysis. An automatic blind segmentation approach is adopted to segment the speech signal into voiced and unvoiced regions using a dynamic threshold based on energy and zero crossing rate (ZCR). Fourth-formant frequencies are analysed to identify intra-morphic (unfilled) pauses within voiced regions. The durations of intra-morphic pauses are analysed for stuttered and normal speech. In normal speech, intra-morphic pauses are observed to range from 150 ms to 250 ms, inter-morphic pauses are <=250 ms, and short pauses range from 50 ms to 150 ms. In stuttered speech, short intra-morphic pauses range from 10 ms to 50 ms, and long pauses range from 250 ms to 1 or 2 seconds. Segmentation of the intra-morphic pauses achieves an accuracy of 98%. The results are compared with and validated against a manual method.
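The energy and ZCR segmentation step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the frame sizes, the scaling factor `k`, the choice of mean-based thresholds, and the function names `frame_signal`, `runs_of_true`, and `find_silent_pauses` are all assumptions made for illustration. The sketch marks a frame as silent when both its short-time energy and its zero crossing rate fall below thresholds derived from the signal itself, then reports each run of silent frames as a pause with start and end times in seconds. It does not cover the fourth-formant analysis.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def runs_of_true(mask):
    """Return (first, last) frame indices for each run of consecutive True values."""
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2], edges[1::2] - 1))

def find_silent_pauses(x, sr, frame_ms=20, hop_ms=10, k=0.1):
    """Return a list of (start_s, end_s) for silent runs.

    Assumed rule (illustrative only): a frame is silent when its energy is
    below k * mean energy and its ZCR is below the mean ZCR of the signal.
    """
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)

    # Short-time energy: mean squared amplitude per frame.
    energy = np.mean(frames ** 2, axis=1)
    # Zero crossing rate: fraction of adjacent sample pairs whose sign differs.
    zcr = np.mean(np.diff(np.sign(frames), axis=1) != 0, axis=1)

    # Thresholds are derived from the signal itself rather than fixed in advance.
    silent = (energy < k * energy.mean()) & (zcr < zcr.mean())

    pauses = []
    for first, last in runs_of_true(silent):
        start_s = first * hop / sr
        end_s = (last * hop + frame_len) / sr
        pauses.append((float(start_s), float(end_s)))
    return pauses

# Synthetic check: 0.5 s tone, 0.3 s of digital silence, 0.5 s tone.
sr = 16000
t = np.arange(int(0.5 * sr)) / sr
tone = 0.5 * np.sin(2 * np.pi * 200 * t)
signal = np.concatenate([tone, np.zeros(int(0.3 * sr)), tone])
for start_s, end_s in find_silent_pauses(signal, sr):
    print(f"pause from {start_s:.2f} s to {end_s:.2f} s "
          f"({1000 * (end_s - start_s):.0f} ms)")
```

The detected durations could then be compared against the ranges reported in the abstract. On real recordings with background noise, silence may show a high ZCR, so the thresholds above would need tuning.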
Pages: 921-926
Page count: 6