A multiple deformable template approach for visual speech recognition

Cited by: 0
Authors
Chandramohan, D
Silsbee, PL
Institution
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
In this paper, we propose an improved deformable template algorithm for modeling the shape of a talker's mouth. We use a two-step approach that begins by classifying mouth images into broad categories. The classification procedure yields both a set of template parameters (in effect, a unique template) and a set of initial conditions. In the second step, the deformable template is allowed to converge using standard techniques. The multi-model approach is significantly more flexible than single-model approaches and consistently provides better solutions. We present examples of single- and multiple-template solutions that support this statement. In a small recognition experiment based only on visual information, recognition of consonants improved from 16% to 33% when multiple templates were used.
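The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-centroid classifier, the quadratic toy energy, the finite-difference gradient descent, and every name (`TemplateModel`, `classify`, `fit_template`, `fit_mouth`, `quadratic_energy`) are assumptions chosen only to show the control flow. In a real system the energy would be computed from image data such as edges and intensity valleys around the mouth.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Vector = List[float]


@dataclass
class TemplateModel:
    """Hypothetical per-category template (not from the paper)."""
    centroid: Vector                    # feature centroid used for classification
    energy: Callable[[Vector], float]   # cost of a template parameter vector
    initial_params: Vector              # initial conditions for convergence


def classify(features: Vector, models: Dict[str, TemplateModel]) -> str:
    """Step 1 (assumed nearest-centroid rule): pick the broad category."""
    def squared_distance(name: str) -> float:
        return sum((f - c) ** 2 for f, c in zip(features, models[name].centroid))
    return min(models, key=squared_distance)


def numerical_gradient(energy: Callable[[Vector], float],
                       params: Vector, eps: float = 1e-5) -> Vector:
    """Central-difference gradient of the energy at `params`."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((energy(up) - energy(down)) / (2 * eps))
    return grad


def fit_template(model: TemplateModel, step: float = 0.1,
                 iterations: int = 200) -> Vector:
    """Step 2 (assumed plain gradient descent): converge from the initial conditions."""
    params = list(model.initial_params)
    for _ in range(iterations):
        grad = numerical_gradient(model.energy, params)
        params = [p - step * g for p, g in zip(params, grad)]
    return params


def fit_mouth(features: Vector,
              models: Dict[str, TemplateModel]) -> Tuple[str, Vector]:
    """Classify first, then converge the template selected by the classifier."""
    category = classify(features, models)
    return category, fit_template(models[category])


def quadratic_energy(target: Vector) -> Callable[[Vector], float]:
    """Toy stand-in for an image-derived energy, minimized at `target`."""
    return lambda params: sum((p - t) ** 2 for p, t in zip(params, target))
```

Because the classifier supplies both the template and its starting point, each category can begin convergence near a plausible solution, which is the flexibility the abstract attributes to the multi-model approach.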
Pages: 50-53
Page count: 4