Filled Pauses and Lengthenings Detection Based on the Acoustic Features for the Spontaneous Russian Speech

Authors
Verkhodanova, Vasilisa [1]
Shapranov, Vladimir [2]
Affiliations
[1] SPIIRAS, 39 14th Line, St Petersburg 199178, Russia
[2] Betria Syst Inc, St Petersburg, Russia
Source
SPEECH AND COMPUTER | 2014 / Vol. 8773
Keywords
speech disfluencies; filled pauses; lengthenings; hesitation; speech corpus; spontaneous speech processing; speech recognition; RECOGNITION;
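DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract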
Spontaneous speech processing faces a number of problems, among them speech disfluencies. Although speakers handle most disfluencies easily and they usually cause no difficulty for understanding, in Automatic Speech Recognition (ASR) systems their appearance leads to many recognition errors. Our paper deals with the most frequent of them, filled pauses and sound lengthenings, based on an analysis of their acoustic parameters. A method based on the autocorrelation function was used to detect voiced hesitation phenomena, and a band-filtering method was used to detect unvoiced hesitation phenomena. For the experiments on filled pause and lengthening detection, a specially collected corpus of spontaneous Russian map-task and appointment-task dialogs was used. The accuracy of voiced filled pause and lengthening detection was 80%, and the accuracy of unvoiced fricative lengthening detection was 66%.
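The abstract names the autocorrelation function as the basis for detecting voiced hesitations but gives no implementation details. The following is only a minimal sketch of the general idea (a frame counts as voiced when its normalized autocorrelation has a strong peak at a plausible pitch lag), not the authors' method. The function names, the 60-400 Hz pitch range, the 0.5 threshold, and the 30 ms frame at 16 kHz are all illustrative assumptions.

```python
import math
import random


def normalized_autocorr_peak(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Return the largest normalized autocorrelation value over pitch-range lags.

    f_min and f_max (Hz) bound the assumed pitch range; they are illustrative
    values, not parameters taken from the paper.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)         # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                         # silent frame: no periodicity
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best


def is_voiced(frame, sample_rate, threshold=0.5):
    """Hypothetical voicing decision: strong periodicity means voiced."""
    return normalized_autocorr_peak(frame, sample_rate) >= threshold


# Synthetic check: a 150 Hz tone stands in for a voiced sound,
# Gaussian noise for an unvoiced one.
sample_rate = 16000
frame_len = 480  # 30 ms at 16 kHz (assumed frame size)
tone = [math.sin(2 * math.pi * 150 * i / sample_rate) for i in range(frame_len)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]

tone_voiced = is_voiced(tone, sample_rate)
noise_voiced = is_voiced(noise, sample_rate)
```

A frame-level voicing decision like this is only a first step. Turning it into a filled pause or lengthening detector would need further criteria, such as duration or spectral stability, which the abstract does not specify.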
Pages: 227 - 234
Number of pages: 8