Filled Pauses and Lengthenings Detection Based on the Acoustic Features for the Spontaneous Russian Speech

Cited by: 0
Authors
Verkhodanova, Vasilisa [1]
Shapranov, Vladimir [2]
Affiliations
[1] SPIIRAS, 39 14th Line, St Petersburg 199178, Russia
[2] Betria Syst Inc, St Petersburg, Russia
Source
SPEECH AND COMPUTER | 2014 / Vol. 8773
Keywords
speech disfluencies; filled pauses; lengthenings; hesitation; speech corpus; spontaneous speech processing; speech recognition; RECOGNITION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spontaneous speech processing poses a number of problems, among them speech disfluencies. Although speakers handle most disfluencies easily and they usually cause no difficulty for understanding, in Automatic Speech Recognition (ASR) systems their appearance leads to many recognition errors. Our paper deals with the most frequent of them, filled pauses and sound lengthenings, based on an analysis of their acoustic parameters. A method based on the autocorrelation function was used to detect voiced hesitation phenomena, and a band-filtering method was used to detect unvoiced hesitation phenomena. For the experiments on filled pause and lengthening detection, a specially collected corpus of spontaneous Russian map-task and appointment-task dialogs was used. The accuracy of voiced filled pause and lengthening detection was 80%, and the accuracy of unvoiced fricative lengthening detection was 66%.
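To illustrate the autocorrelation idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It marks a frame as voiced when its normalized autocorrelation has a strong peak in a plausible pitch-lag range, and reports runs of voiced frames that last long enough to be candidate filled pauses or lengthenings. The function names, frame sizes, the 0.5 threshold, the 0.25 s minimum duration, and the 75 to 400 Hz pitch range are all assumptions chosen for illustration; the record gives none of these values. The band-filtering step for unvoiced fricatives is not covered.

```python
import math


def autocorr_peak(frame, min_lag, max_lag):
    """Return (peak, lag): the largest normalized autocorrelation of the
    frame over lags in [min_lag, max_lag]. Normalization is by frame energy."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0, 0  # silent frame: no periodicity to measure
    best, best_lag = 0.0, 0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
        if r > best:
            best, best_lag = r, lag
    return best, best_lag


def long_voiced_runs(signal, sample_rate, frame_ms=40, hop_ms=10,
                     threshold=0.5, min_duration_s=0.25,
                     f0_min=75.0, f0_max=400.0):
    """Return (start_s, end_s) spans where consecutive frames look voiced
    for at least min_duration_s. All default values are illustrative guesses."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    min_lag = int(sample_rate / f0_max)  # shortest pitch period in samples
    max_lag = int(sample_rate / f0_min)  # longest pitch period in samples
    n_frames = max(0, (len(signal) - frame_len) // hop + 1)

    runs, start = [], None
    for k in range(n_frames):
        frame = signal[k * hop:k * hop + frame_len]
        peak, _ = autocorr_peak(frame, min_lag, max_lag)
        t = k * hop / sample_rate  # start time of this frame
        if peak >= threshold:
            if start is None:
                start = t
        elif start is not None:
            if t - start >= min_duration_s:
                runs.append((start, t))
            start = None
    if start is not None:  # close a run that reaches the end of the signal
        end = ((n_frames - 1) * hop + frame_len) / sample_rate
        if end - start >= min_duration_s:
            runs.append((start, end))
    return runs


if __name__ == "__main__":
    sr = 8000
    # Synthetic input: 0.5 s of a 100 Hz tone followed by 0.5 s of silence.
    tone = [math.sin(2 * math.pi * 100 * n / sr) for n in range(sr // 2)]
    print(long_voiced_runs(tone + [0.0] * (sr // 2), sr))
```

A real detector would also need to separate sustained hesitations from ordinary voiced speech, for example by checking that the pitch lag and spectrum stay stable across the run; that refinement is left out of this sketch.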
Pages: 227-234
Page count: 8
Related papers
50 in total
  • [31] Acoustic Features for Classification Based Speech Separation
    Wang, Yuxuan
    Han, Kun
    Wang, DeLiang
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1530 - 1533
  • [32] Speech Recognition with Word Fragment Detection Using Prosody Features for Spontaneous Speech
    Yeh, Jui-Feng
    Yen, Ming-Chi
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2012, 6 (02): : 669S - 675S
  • [33] Detection of Speech Embedded in Real Acoustic Background Based on Amplitude Modulation Spectrogram Features
    Anemueller, Joern
    Schmidt, Denny
    Bach, Joerg-Hendrik
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 2582 - 2585
  • [34] DETECTION OF COPD EXACERBATION FROM SPEECH: COMPARISON OF ACOUSTIC FEATURES AND DEEP LEARNING BASED SPEECH BREATHING MODELS
    Nallanthighal, Venkata Srikanth
    Harma, Aki
    Strik, Helmer
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 9097 - 9101
  • [35] Combining Syntactic and Acoustic Features for Prosodic Boundary Detection in Russian
    Kocharov, Daniil
    Kachkovskaia, Tatiana
    Mirzagitova, Aliya
    Skrelin, Pavel
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2016, 2016, 9918 : 68 - 79
  • [36] Acoustic Features Characterization of Autism Speech for Automated Detection and Classification
    Mohanta, Abhijit
    Mukherjee, Prerana
    Mirtal, Vinay Kumar
    2020 TWENTY SIXTH NATIONAL CONFERENCE ON COMMUNICATIONS (NCC 2020), 2020,
  • [37] Accuracy of perceptual and acoustic methods for the detection of inspiratory loci in spontaneous speech
    Wang, Yu-Tsai
    Nip, Ignatius S. B.
    Green, Jordan R.
    Kent, Ray D.
    Kent, Jane Finley
    Ullman, Cara
    BEHAVIOR RESEARCH METHODS, 2012, 44 (04) : 1121 - 1128
  • [38] Accuracy of perceptual and acoustic methods for the detection of inspiratory loci in spontaneous speech
    Yu-Tsai Wang
    Ignatius S. B. Nip
    Jordan R. Green
    Ray D. Kent
    Jane Finley Kent
    Cara Ullman
    Behavior Research Methods, 2012, 44 : 1121 - 1128
  • [39] Gradient-Based Acoustic Features for Speech Recognition
    Muroi, Takashi
    Takashima, Ryoichi
    Takiguchi, Tetsuya
    Ariki, Yasuo
    2009 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ISPACS 2009), 2009, : 445 - 448
  • [40] Classifying clear and conversational speech based on acoustic features
    Amano-Kusumoto, Akiko
    Hosom, John-Paul
    Shafran, Izhak
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1699 - 1702