Filled Pauses and Lengthenings Detection Based on the Acoustic Features for the Spontaneous Russian Speech

Cited by: 0

Authors:

Verkhodanova, Vasilisa [1]

Shapranov, Vladimir [2]

Affiliations:

[1] SPIIRAS, 39 14th Line, St Petersburg 199178, Russia

[2] Betria Syst Inc, St Petersburg, Russia

Source:

SPEECH AND COMPUTER | 2014 / Vol. 8773

Keywords:

speech disfluencies; filled pauses; lengthenings; hesitation; speech corpus; spontaneous speech processing; speech recognition; RECOGNITION;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Spontaneous speech processing faces a number of problems, among them speech disfluencies. Although speakers handle most disfluencies easily and they usually do not hinder understanding, their appearance leads to many recognition errors in Automatic Speech Recognition (ASR) systems. Our paper deals with the most frequent of them, filled pauses and sound lengthenings, based on an analysis of their acoustic parameters. A method based on the autocorrelation function was used to detect voiced hesitation phenomena, and a band-filtering method was used to detect unvoiced hesitation phenomena. The experiments on filled pause and lengthening detection used a specially collected corpus of spontaneous Russian map-task and appointment-task dialogs. The accuracy of voiced filled pause and lengthening detection was 80%, and the accuracy of unvoiced fricative lengthening detection was 66%.
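The abstract names an autocorrelation-based method for voiced hesitations but gives no algorithmic detail, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a frame counts as voiced when its normalized autocorrelation has a strong peak at a plausible pitch lag, and that long unbroken runs of voiced frames are hesitation candidates. The function names, the pitch range, the frame sizes, the 0.5 threshold and the 0.2 s minimum duration are all hypothetical choices made for this example.

```python
import numpy as np

def voicing_score(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Peak of the normalized autocorrelation inside an assumed pitch-lag range."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0  # silent frame: treat as unvoiced
    # Full autocorrelation, keeping non-negative lags only, scaled so lag 0 equals 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:] / energy
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    if lag_max < lag_min:
        return 0.0  # frame too short to cover the assumed pitch range
    return float(np.max(acf[lag_min:lag_max + 1]))

def long_voiced_runs(signal, sample_rate, frame_ms=40.0, hop_ms=10.0,
                     threshold=0.5, min_dur_s=0.2):
    """Return (start_s, end_s) spans where consecutive frames stay voiced.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    voiced = [
        voicing_score(signal[i:i + frame_len], sample_rate) >= threshold
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]
    runs, start = [], None
    for idx, flag in enumerate(voiced + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = idx
        elif not flag and start is not None:
            t0 = start * hop / sample_rate
            t1 = ((idx - 1) * hop + frame_len) / sample_rate
            if t1 - t0 >= min_dur_s:
                runs.append((t0, t1))
            start = None
    return runs
```

Sustained voicing alone does not separate a filled pause or lengthening from an ordinary long vowel, so a sketch like this only proposes candidates. Any further acoustic criteria, and the band-filtering step for unvoiced fricatives, would have to come from the paper itself.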


Pages: 227-234

Number of pages: 8
