A multiple deformable template approach for visual speech recognition

Cited by: 0

Authors:

Chandramohan, D

Silsbee, PL

Institution:

Source:

ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we propose an improved deformable template algorithm for modeling the shape of a talker's mouth. We use a two-step approach that begins by classifying mouth images into broad categories. The classification procedure yields both a set of template parameters (in effect, a unique template) and a set of initial conditions. The second step is to allow the deformable template to converge using standard techniques. The multi-model approach is significantly more flexible than single-model approaches and consistently provides better solutions. We present examples of single- and multiple-template solutions that support this statement. In a small recognition experiment, recognition of consonants improved from 16% to 33%, based only on visual information, when multiple templates were used.
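The two-step structure described in the abstract (classify first, then let the selected template converge) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the category names, the parabolic lip contour, the nearest-centroid classifier, the squared-error energy on already-extracted contour points, and all numeric values are hypothetical. The paper's actual templates, energy function, and classifier are not specified in this record.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical per-category template (not the paper's definition)."""
    name: str
    half_width: float   # template parameter: fixed mouth half-width
    init_height: float  # initial condition: starting lip-opening height
    centroid: tuple     # coarse (height, half_width) feature centroid


# Assumed broad categories with illustrative values.
TEMPLATES = [
    Template("closed", half_width=20.0, init_height=1.0, centroid=(1.0, 20.0)),
    Template("open", half_width=15.0, init_height=10.0, centroid=(10.0, 15.0)),
]


def classify(feature):
    """Step 1: choose the broad category by nearest centroid."""
    def sq_dist(t):
        return sum((a - b) ** 2 for a, b in zip(feature, t.centroid))
    return min(TEMPLATES, key=sq_dist)


def contour(height, half_width, x):
    """Toy parabolic lip contour: peaks at x = 0, zero at the corners."""
    return height * (1.0 - (x / half_width) ** 2)


def energy(height, template, points):
    """Mean squared mismatch between the contour and observed points."""
    return sum(
        (y - contour(height, template.half_width, x)) ** 2 for x, y in points
    ) / len(points)


def converge(template, points, steps=100, lr=0.5):
    """Step 2: gradient descent on the energy, starting from the
    category-specific initial condition."""
    h = template.init_height
    for _ in range(steps):
        grad = 0.0
        for x, y in points:
            basis = 1.0 - (x / template.half_width) ** 2
            grad += -2.0 * basis * (y - h * basis)
        h -= lr * grad / len(points)
    return h


if __name__ == "__main__":
    # Synthetic contour points for an open mouth (assumed data).
    xs = (-15.0, -7.5, 0.0, 7.5, 15.0)
    observed = [(x, contour(8.0, 15.0, x)) for x in xs]
    chosen = classify((8.0, 15.0))
    fitted = converge(chosen, observed)
    print(chosen.name, round(fitted, 3), energy(fitted, chosen, observed))
```

The point of the sketch is only the division of labor: the classifier supplies both the template parameters and the starting point, so the descent step begins near a plausible solution instead of from a single generic initialization.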

Pages: 50-53

Page count: 4

50 items in total

[31] DTW Speech Recognition Algorithm of Optimization Template Matching
Zhang, Jing
Qin, Benzhuo
2012 WORLD AUTOMATION CONGRESS (WAC), 2012
[32] Integrate Template Matching and Statistical Modeling for Speech Recognition
Sun, Xie
Zhao, Yunxin
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 74 - 77
[33] Visual speech information for face recognition
Lawrence D. Rosenblum
Deborah A. Yakel
Naser Baseer
Anjani Panchal
Brynn C. Nodarse
Ryan P. Niehus
Perception & Psychophysics, 2002, 64 : 220 - 229
[34] RESOLUTION LIMITS ON VISUAL SPEECH RECOGNITION
Bear, Helen L.
Harvey, Richard
Theobald, Barry-John
Lan, Yuxuan
2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 1371 - 1375
[35] TELEVISED VISUAL CONTRIBUTION TO SPEECH RECOGNITION
BROADBENT, D
IEEE TRANSACTIONS ON EDUCATION, 1970, E 13 (02) : 79 - +
[36] Visual Hallucination Elevates Speech Recognition
Zhang, Fang
Zhu, Yongxin
Wang, Xiangxiang
Chen, Huang
Sun, Xing
Xu, Linli
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19542 - 19550
[37] Boundary Descriptors for Visual Speech Recognition
Gupta, Deepika
Singh, Preety
Laxmi, V.
Gaur, Manoj S.
COMPUTER AND INFORMATION SCIENCES II, 2012, : 307 - 313
[38] Visual speech information for face recognition
Rosenblum, LD
Yakel, DA
Baseer, N
Panchal, A
Nodarse, BC
Niehus, RP
PERCEPTION & PSYCHOPHYSICS, 2002, 64 (02) : 220 - 229
[39] A deformable template approach to detecting straight edges in radar images
Lakshmanan, S
Grimmer, D
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1996, 18 (04) : 438 - 443
[40] Face Recognition: A Template Based Approach
Archana, T.
Venugopal, T.
2015 INTERNATIONAL CONFERENCE ON GREEN COMPUTING AND INTERNET OF THINGS (ICGCIOT), 2015, : 966 - 969
