Filled Pauses and Lengthenings Detection Based on the Acoustic Features for the Spontaneous Russian Speech

Authors
Verkhodanova, Vasilisa [1]
Shapranov, Vladimir [2]
Affiliations
[1] SPIIRAS, 39 14th Line, St Petersburg 199178, Russia
[2] Betria Syst Inc, St Petersburg, Russia
Source
SPEECH AND COMPUTER | 2014 / Vol. 8773
Keywords
speech disfluencies; filled pauses; lengthenings; hesitation; speech corpus; spontaneous speech processing; speech recognition; RECOGNITION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spontaneous speech processing poses a number of problems, among them speech disfluencies. Although speakers handle most disfluencies easily and they usually cause no difficulty for understanding, their appearance leads to many recognition errors in Automatic Speech Recognition (ASR) systems. This paper deals with the most frequent disfluencies, filled pauses and sound lengthenings, based on an analysis of their acoustic parameters. A method based on the autocorrelation function was used to detect voiced hesitation phenomena, and a band-filtering method was used to detect unvoiced hesitation phenomena. The experiments on filled pause and lengthening detection used a specially collected corpus of spontaneous Russian map-task and appointment-task dialogs. The accuracy of detecting voiced filled pauses and lengthenings was 80%, and the accuracy of detecting unvoiced fricative lengthenings was 66%.
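The abstract names an autocorrelation-based method for voiced hesitations but gives no implementation details. The following is only a minimal sketch of the general idea, not the authors' algorithm: it assumes that a sustained, strongly periodic stretch of signal is a candidate for a voiced filled pause or lengthening. The function names, the 40 ms frame, the 10 ms hop, the 75-400 Hz pitch range, the 0.5 periodicity threshold, and the 200 ms minimum duration are all illustrative assumptions.

```python
import numpy as np


def autocorr_peak(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Highest normalized autocorrelation value within an assumed pitch lag range."""
    frame = frame - np.mean(frame)
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        return 0.0  # silent frame: no periodicity to measure
    # Full autocorrelation; keep non-negative lags and normalize so lag 0 equals 1.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:] / energy
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(ac) - 1)
    return float(np.max(ac[lag_min:lag_max + 1]))


def voiced_frames(signal, sample_rate, frame_ms=40, hop_ms=10, threshold=0.5):
    """Flag each frame whose autocorrelation peak exceeds the (assumed) threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    flags = [
        autocorr_peak(signal[start:start + frame_len], sample_rate) > threshold
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
    return np.array(flags, dtype=bool)


def long_voiced_runs(flags, hop_ms=10, min_ms=200):
    """Return (start_frame, end_frame) pairs for voiced runs lasting at least min_ms."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) * hop_ms >= min_ms:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the signal.
    if start is not None and (len(flags) - start) * hop_ms >= min_ms:
        runs.append((start, len(flags)))
    return runs
```

In practice a periodicity check like this would flag every long vowel, so a real detector would need further cues (for example, stability of pitch and spectrum over the run) to separate hesitations from ordinary voiced speech.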
Pages: 227 - 234
Page count: 8
Related papers
  • [1] Multi-factor Method for Detection of Filled Pauses and Lengthenings in Russian Spontaneous Speech
    Verkhodanova, Vasilisa
    Shapranov, Vladimir
    SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 285 - 292
  • [2] Detecting Filled Pauses and Lengthenings in Russian Spontaneous Speech Using SVM
    Verkhodanova, Vasilisa
    Shapranov, Vladimir
    SPEECH AND COMPUTER, 2016, 9811 : 224 - 231
  • [3] Acoustic feature analysis and discriminative modeling of filled pauses for spontaneous speech recognition
    Wu, CH
    Yan, GL
    JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2004, 36 (2-3) : 91 - 104
  • [4] Acoustic Feature Analysis and Discriminative Modeling of Filled Pauses for Spontaneous Speech Recognition
    Chung-Hsien Wu
    Gwo-Lang Yan
    Journal of VLSI signal processing systems for signal, image and video technology, 2004, 36 : 91 - 104
  • [5] Automatic identification of filled pauses in spontaneous speech
    O'Shaughnessy, D
    Gabrea, M
    2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000 : 620 - 624
  • [6] Filled pauses in multilingual speech: an acoustic analysis
    Spreafico, Lorenzo
    LINGUISTICA E FILOLOGIA, 2016, (36) : 99 - 116
  • [7] Modeling filled pauses for spontaneous speech recognition applications
    Zgank, Andrej
    Rotovnik, Tomaz
    Maucec, Mirjam Sepesy
    AEE' 08: PROCEEDINGS OF THE 7TH WSEAS INTERNATIONAL CONFERENCE ON APPLICATION OF ELECTRICAL ENGINEERING, 2008 : 42 - +
  • [8] Occurrences and Durations of Filled Pauses in Relation to Words and Silent Pauses in Spontaneous Speech
    Gosy, Maria
    LANGUAGES, 2023, 8 (01)
  • [9] Entrainment in spontaneous speech: the case of filled pauses in Supreme Court hearings
    Benus, Stefan
    Levitan, Rivka
    Hirschberg, Julia
    3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012 : 793 - 797
  • [10] THE USE OF ACOUSTICALLY DETECTED FILLED AND SILENT PAUSES IN SPONTANEOUS SPEECH RECOGNITION
    Ogata, Jun
    Goto, Masataka
    Itou, Katunobu
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009 : 4305 - +