Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

Authors
Bredin, Herve [1]
Poignant, Johann [2]
Tapaswi, Makarand [3]
Fortier, Guillaume [4]
Viet Bac Le [5]
Napoleon, Thibault [6]
Gao, Hua [3]
Barras, Claude [1]
Rosset, Sophie [1]
Besacier, Laurent [2]
Verbeek, Jakob [4]
Quenot, Georges [2]
Jurie, Frederic [6]
Ekenel, Hazim Kemal [3]
Affiliations
[1] Univ Paris 11, CNRS, UPR 3251, LIMSI, BP 133, F-91403 Orsay, France
[2] UJF Grenoble 1, UPMF Grenoble 2, Grenoble INP, CNRS, UMR 5217, LIG, F-38041 Grenoble, France
[3] Karlsruher Inst Technol, Karlsruhe, Germany
[4] INRIA Rhone Alpes, F-38330 Montbonnot St Martin, France
[5] Vocapia Res, F-91400 Orsay, France
[6] Univ Caen, GREYC, UMR 6072, F-14050 Caen, France
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Repere challenge is a project that evaluates systems for supervised and unsupervised multimodal recognition of people in TV broadcasts. In this paper, we describe, evaluate, and discuss the QCompere consortium's submissions to the dry run of the 2012 Repere evaluation campaign. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as that of supervised monomodal ones (with several hundred identity models).
Pages: 385-394
Number of pages: 10