Text-to-visual speech synthesis based on parameter generation from HMM

Citations: 0
Authors
Masuko, T [1 ]
Kobayashi, T [1 ]
Tamura, M [1 ]
Masubuchi, J [1 ]
Tokuda, K [1 ]
Affiliations
[1] Tokyo Inst Technol, Precis & Intelligence Lab, Yokohama, Kanagawa 2268503, Japan
Keywords
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper presents a new technique for synthesizing visual speech from arbitrarily given text. The technique is based on an algorithm for parameter generation from HMMs with dynamic features, which has been successfully applied to text-to-speech synthesis. In the training phase, syllable HMMs are trained on visual speech parameter sequences that represent lip movements. In the synthesis phase, a sentence HMM is constructed by concatenating the syllable HMMs corresponding to the phonetic transcription of the input text. An optimum visual speech parameter sequence is then generated from the sentence HMM in the maximum-likelihood (ML) sense. The proposed technique can generate lip movements synchronized with speech in a unified framework, and coarticulation is implicitly incorporated into the generated mouth shapes. As a result, the synthesized lip motion is smooth and realistic.
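As a rough illustration only (not taken from the paper), the ML parameter-generation step summarized above reduces, for a fixed state sequence and diagonal covariances, to solving the linear system (W^T U^-1 W) c = W^T U^-1 m for the static visual-parameter trajectory c, where m and U are the stacked static-plus-delta means and variances along the sentence HMM and W is the window matrix that appends delta features. The Python sketch below assumes a single visual-parameter stream and uses a hypothetical generate_trajectory helper; it is a minimal sketch of that closed-form solution, not the authors' implementation.

```python
# Minimal sketch (assumed, not the authors' code) of ML parameter generation
# from an HMM with dynamic features: solve (W^T U^-1 W) c = W^T U^-1 m,
# where c is the static trajectory, m and U are the static+delta means and
# diagonal variances along the sentence-HMM state sequence, and W stacks
# the delta windows.
import numpy as np

def generate_trajectory(means, variances, delta_window=(-0.5, 0.0, 0.5)):
    """means, variances: shape (T, 2) arrays holding the static and delta
    statistics per frame for one visual parameter (e.g. lip opening),
    read off the chosen state sequence of the sentence HMM."""
    T = means.shape[0]
    half = len(delta_window) // 2

    # W maps the static trajectory c (length T) to the stacked
    # static+delta observation vector (length 2T).
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                           # static row
        for j, w in enumerate(delta_window):        # delta row
            tau = min(max(t + j - half, 0), T - 1)  # clip at utterance edges
            W[2 * t + 1, tau] += w

    m = means.reshape(-1)                   # [static_0, delta_0, static_1, ...]
    precision = 1.0 / variances.reshape(-1)

    A = W.T @ (precision[:, None] * W)      # W^T U^-1 W
    b = W.T @ (precision * m)               # W^T U^-1 m
    return np.linalg.solve(A, b)            # ML static trajectory c

# Dummy usage with random statistics for a 5-frame state sequence:
# rng = np.random.default_rng(0)
# c = generate_trajectory(rng.normal(size=(5, 2)), np.ones((5, 2)) * 0.1)
```

Because the system is solved jointly over all frames, the delta constraints couple neighboring frames, which is what produces the smooth, coarticulated trajectories described in the abstract rather than a stepwise sequence of state means.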
Pages: 3745 - 3748
Number of pages: 4
Related papers
50 records in total
  • [41] Subjective analysis of an HMM-based visual speech synthesizer
    Williams, JJ
    Katsaggelos, AK
    Garstecki, DC
    HUMAN VISION AND ELECTRONIC IMAGING VI, 2001, 4299 : 544 - 555
  • [42] DEMONSTRATION OF AN HMM-BASED PHOTOREALISTIC EXPRESSIVE AUDIO-VISUAL SPEECH SYNTHESIS SYSTEM
    Filntisis, Panagiotis Paraskevas
    Katsamanis, Athanasios
    Maragos, Petros
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 4588 - 4588
  • [43] Normalized training for HMM-based visual speech recognition
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    Kitamura, Tadashi
    Kobayashi, Takao
    ELECTRONICS AND COMMUNICATIONS IN JAPAN PART III-FUNDAMENTAL ELECTRONIC SCIENCE, 2006, 89 (11): 40 - 50
  • [44] Normalized training for HMM-based visual speech recognition
    Nankaku, Y
    Tokuda, K
    Kitamura, T
    Kobayashi, T
    2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS, 2000, : 234 - 237
  • [45] HMM-based distributed text-to-speech synthesis incorporating speaker-adaptive training
    Jeon, Kwang Myung
    Choi, Seung Ho
    International Journal of Multimedia and Ubiquitous Engineering, 2014, 9 (05): 107 - 119
  • [46] A Novel Text-to-Speech Synthesis System Using Syllable-Based HMM for Tamil Language
    Manoharan, J. Samuel
    PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON SUSTAINABLE EXPERT SYSTEMS (ICSES 2021), 2022, 351 : 305 - 314
  • [47] Minimum generation error linear regression based model adaptation for HMM-based speech synthesis
    Qin, Long
    Wu, Yi-Jian
    Ling, Zhen-Hua
    Wang, Ren-Hua
    Dai, Li-Rong
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3953 - +
  • [48] Speaker-adaptive visual speech synthesis in the HMM-framework
    Schabus, Dietmar
    Pucher, Michael
    Hofer, Gregor
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 978 - 981
  • [49] Synthesis of stressed speech from isolated neutral speech using HMM-based models
    BouGhazale, SE
    Hansen, JHL
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1860 - 1863
  • [50] An HMM-based Mandarin Chinese Text-to-Speech system
    Qian, Yao
    Soong, Frank
    Chen, Yining
    Chu, Min
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 223 - +