Improving the accuracy of the speech synthesis based phonetic alignment using multiple acoustic features

Cited by: 0

Authors:

Paulo, S [1]

Oliveira, LC [1]

Affiliations:

[1] IST, INESC ID, Spoken Language Syst Lab, P-1000029 Lisbon, Portugal

Source:

COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROCEEDINGS | 2003 / Vol. 2721

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The phonetic alignment of spoken utterances for speech research is commonly performed by HMM-based speech recognizers in forced-alignment mode, but training the phonetic segment models requires considerable amounts of annotated data. When no such material is available, a possible solution is to synthesize the same phonetic sequence and align the resulting speech signal with the spoken utterances. However, without a careful choice of the acoustic features used, this procedure can perform poorly when applied to continuous speech utterances. In this paper we propose a new method to select the best features to use in the alignment procedure for each pair of phonetic segment classes. The results show that this selection considerably reduces the segment boundary location errors.
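The alignment idea in the abstract can be illustrated with a minimal sketch. It assumes dynamic time warping (DTW) as the alignment technique and a plain Euclidean frame distance; the abstract names neither, so both are assumptions, as are the function names `dtw_path` and `map_boundaries`. The `dist` parameter marks the place where a distance built from the features chosen for a given pair of phonetic segment classes could be plugged in; the selection method itself is not reproduced here.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed frame distance)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(synth, spoken, dist=euclidean):
    """Align two sequences of feature frames with classic DTW.

    Returns a list of (synth_index, spoken_index) pairs from first to last frame.
    """
    n, m = len(synth), len(spoken)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning the first i synthetic
    # frames with the first j spoken frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(synth[i - 1], spoken[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def map_boundaries(path, synth_boundaries):
    """Map known boundary frames of the synthetic signal onto the spoken signal.

    Each boundary is mapped to the first spoken frame aligned with it.
    """
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in synth_boundaries]


# Toy one-dimensional "features": the segment change is at frame 2 in the
# synthetic signal and one frame later in the spoken signal.
synth = [[0.0], [0.0], [1.0], [1.0]]
spoken = [[0.0], [0.0], [0.0], [1.0], [1.0]]
path = dtw_path(synth, spoken)
spoken_boundaries = map_boundaries(path, [2])
```

In this toy case the synthetic signal's segment boundaries are known exactly because the synthesizer produced them, and the warping path carries them over to the spoken utterance. With real speech the frames would be acoustic feature vectors, and the choice of those features is what the paper's selection method addresses.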

Pages: 31 - 39

Number of pages: 9
