Fully automatic face recognition system using a combined audio-visual approach

被引:6
|
作者
Albiol, A [1 ]
Torres, L
Delp, EJ
机构
[1] Univ Politecn Valencia, Dept Commun, Valencia, Spain
[2] Tech Univ Catalonia, Dept Signal Theory & Commun, Barcelona, Spain
[3] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
来源
关键词
D O I
10.1049/ip-vis:20045082
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
This paper presents a novel audio and video information fusion approach that greatly improves automatic recognition of people in video sequences. To that end, audio and video information is first used independently to obtain confidence values that indicate the likelihood that a specific person appears in a video shot. Finally, a post-classifier is applied to fuse audio and visual confidence values. The system has been tested on several newssequences and the results indicate that a significant improvement in the recognition rate can be achieved when both modalities are used together.
引用
收藏
页码:318 / 326
页数:9
相关论文
共 50 条
  • [1] An audio-visual speech recognition system for testing new audio-visual databases
    Pao, Tsang-Long
    Liao, Wen-Yuan
    VISAPP 2006: PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2006, : 192 - +
  • [2] Audio-Visual Automatic Speech Recognition for Connected Digits
    Wang, Xiaoping
    Hao, Yufeng
    Fu, Degang
    Yuan, Chunwei
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL III, PROCEEDINGS, 2008, : 328 - +
  • [3] Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes
    Korshunov, Pavel
    Chen, Haolin
    Garner, Philip N.
    Marcel, Sebastien
    2023 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS, IJCB, 2023,
  • [4] An audio-visual corpus for multimodal automatic speech recognition
    Czyzewski, Andrzej
    Kostek, Bozena
    Bratoszewski, Piotr
    Kotus, Jozef
    Szykulski, Marcin
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2017, 49 (02) : 167 - 192
  • [5] An audio-visual corpus for multimodal automatic speech recognition
    Andrzej Czyzewski
    Bozena Kostek
    Piotr Bratoszewski
    Jozef Kotus
    Marcin Szykulski
    Journal of Intelligent Information Systems, 2017, 49 : 167 - 192
  • [6] A Robust Audio-visual Speech Recognition Using Audio-visual Voice Activity Detection
    Tamura, Satoshi
    Ishikawa, Masato
    Hashiba, Takashi
    Takeuchi, Shin'ichi
    Hayamizu, Satoru
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2702 - +
  • [7] Audio-Visual Emotion Recognition using Gaussian Mixture Models for Face and Voice
    Metallinou, Angeliki
    Lee, Sungbok
    Narayanan, Shrikanth
    ISM: 2008 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, 2008, : 250 - 257
  • [8] Audio-Visual Automatic Speech Recognition Using PZM, MFCC and Statistical Analysis
    Debnath, Saswati
    Roy, Pinki
    INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2021, 7 (02): : 121 - 133
  • [9] Automatic Visual Feature Extraction for Mandarin Audio-Visual Speech Recognition
    Pao, Tsang-Long
    Liao, Wen-Yuan
    Wu, Tsan-Nung
    Lin, Ching-Yi
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 2936 - 2940
  • [10] Audio-Visual Recognition System in Compression Domain
    Wong, Yee Wan
    Seng, Kah Phooi
    Ang, Li-Minn
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2011, 21 (05) : 637 - 646