Fully automatic face recognition system using a combined audio-visual approach

Cited by: 6
Authors
Albiol, A [1]
Torres, L
Delp, EJ
Affiliations
[1] Univ Politecn Valencia, Dept Commun, Valencia, Spain
[2] Tech Univ Catalonia, Dept Signal Theory & Commun, Barcelona, Spain
[3] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
DOI
10.1049/ip-vis:20045082
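Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809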
Abstract
This paper presents a novel audio and video information fusion approach that greatly improves automatic recognition of people in video sequences. To that end, audio and video information are first used independently to obtain confidence values that indicate the likelihood that a specific person appears in a video shot. A post-classifier is then applied to fuse the audio and visual confidence values. The system has been tested on several news sequences, and the results indicate that a significant improvement in the recognition rate can be achieved when both modalities are used together.
Pages: 318-326
Page count: 9