Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model

Cited by: 0

Authors:

Morishima, Shigeo [1]

Ogata, Shin [1]

Murai, Kazumasa [1]

Nakamura, Satoshi [1]

Affiliation:

[1] ATR Spoken Language Translation Res., 2-2-2 Hikaridai Seika-cho, Soraku-gun Kyoto, 619-0288, Japan

Source:

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | 2002 / Vol. 2

Keywords:

Compendex;

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Algorithms - Computer simulation - Face recognition - Interpolation - Speech recognition - Synchronization - Three dimensional computer graphics - Translation (languages)

Citations

50 items in total

[1] Audio-visual speech translation with automatic lip synchronization and face tracking based on 3-D head model
Morishima, S
Ogata, S
Murai, K
Nakamura, S
[J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 2117 - 2120
[2] On the Audio-visual Synchronization for Lip-to-Speech Synthesis
Niu, Zhe
Mak, Brian
[J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 7809 - 7818
[3] Realtime lip contour tracking for audio-visual speech recognition applications
Yazdi, Mehran
Seyfi, Mehdi
Rafati, Amirhossein
Asadi, Meghdad
[J]. World Academy of Science, Engineering and Technology, 2009, 40 : 164 - 167
[4] Lip Tracking Method for the System of Audio-Visual Polish Speech Recognition
Kubanek, Mariusz
Bobulski, Janusz
Adrjanowicz, Lukasz
[J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2012, 7267 : 535 - 542
[5] THE USE OF DYNAMIC DEFORMABLE TEMPLATES FOR LIP TRACKING IN AN AUDIO-VISUAL CORPUS WITH LARGE VARIATIONS IN HEAD POSE, FACE ILLUMINATION AND LIP SHAPES
Wu, Zhiyong
Wu, Jiying
Meng, Helen M.
[J]. 2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 370 - 373
[6] A Facial Feature and Lip Movement Enhanced Audio-Visual Speech Separation Model
Li, Guizhu
Fu, Min
Sun, Mengnan
Liu, Xuefeng
Zheng, Bing
[J]. SENSORS, 2023, 23 (21)
[7] A 3-D Audio-Visual Corpus of Affective Communication
Fanelli, Gabriele
Gall, Juergen
Romsdorfer, Harald
Weise, Thibaut
Van Gool, Luc
[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2010, 12 (06) : 591 - 598
[8] Multimodal Learning Using 3D Audio-Visual Data for Audio-Visual Speech Recognition
Su, Rongfeng
Wang, Lan
Liu, Xunying
[J]. 2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 40 - 43
[9] Audio-visual speech recognition integrating 3D lip information obtained from the Kinect
Wang, Jianrong
Zhang, Ju
Honda, Kiyoshi
Wei, Jianguo
Dang, Jianwu
[J]. MULTIMEDIA SYSTEMS, 2016, 22 (03) : 315 - 323
