Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

Cited by: 0
Authors
Bredin, Herve [1]
Poignant, Johann [2]
Tapaswi, Makarand [3]
Fortier, Guillaume [4]
Viet Bac Le [5]
Napoleon, Thibault [6]
Gao, Hua [3]
Barras, Claude [1]
Rosset, Sophie [1]
Besacier, Laurent [2]
Verbeek, Jakob [4]
Quenot, Georges [2]
Jurie, Frederic [6]
Ekenel, Hazim Kemal [3]
Affiliations
[1] Univ Paris 11, CNRS, UPR 3251, LIMSI, BP 133, F-91403 Orsay, France
[2] UJF Grenoble 1, UPMF Grenoble 2, Grenoble INP, CNRS, UMR 5217, LIG, F-38041 Grenoble, France
[3] Karlsruher Inst Technol, Karlsruhe, Germany
[4] INRIA Rhone Alpes, F-38330 Montbonnot St Martin, France
[5] Vocapia Res, F-91400 Orsay, France
[6] Univ Caen, GREYC, UMR 6072, F-14050 Caen, France
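Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract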
The Repere challenge is a project that aims to evaluate systems for supervised and unsupervised multimodal recognition of people in TV broadcasts. In this paper, we describe, evaluate and discuss the QCompere consortium's submissions to the dry run of the 2012 Repere evaluation campaign. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as that of supervised monomodal ones (with several hundred identity models).
Pages: 385-394
Page count: 10
Related papers
50 in total
  • [1] Improving Speaker Identification in TV-shows using person name detection in overlaid text and speech
    Charlet, Delphine
    Fredouille, Corinne
    Damnati, Geraldine
    Senay, Gregory
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2777 - 2781
  • [2] Fusion of speech and handwritten signatures biometrics for person identification
    Abushariah A.A.M.
    Abushariah M.A.M.
    Gunawan T.S.
    Chebil J.
    Alqudah A.A.M.
    Ting H.-N.
    Mustafa M.B.P.
    International Journal of Speech Technology, 2023, 26 (04) : 833 - 850
  • [3] Introducing FoxPersonTracks: a Benchmark for Person Re-Identification from TV Broadcast Shows
    Auguste, Remi
    Tirilly, Pierre
    Martinet, Jean
    2015 13TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2015,
  • [4] Person identification in TV programs
    Li, DG
    Wei, G
    Sethi, IK
    Dimitrova, N
    JOURNAL OF ELECTRONIC IMAGING, 2001, 10 (04) : 930 - 938
  • [5] Overlay Text Extraction From TV News Broadcast
    Kannao, Raghvendra
    Guha, Prithwijit
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [6] ALIF: A Dataset for Arabic Embedded Text Recognition in TV Broadcast
    Yousfi, Sonia
    Berrani, Sid-Ahmed
    Garcia, Christophe
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 1221 - 1225
  • [7] Feature Level Fusion of Speech and Face Image based Person Identification System
    Sugiarta, Gunawan Y. B.
    Trilaksono, Bambang Riyanto
    Hendrawan
    Suhardi
    2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS: ICCEA 2010, PROCEEDINGS, VOL 2, 2010, : 221 - 225
  • [8] Simultaneous Synchronization of Text and Speech for Broadcast News Subtitling
    Gao, Jie
    Zhao, Qingwei
    Li, Ta
    Yan, Yonghong
    ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 3, PROCEEDINGS, 2009, 5553 : 576 - 585
  • [9] Multimodal Feature Hierarchical Fusion for Text-Image Person Re-identification
    Li, Jiaxuan
    Huang, Likun
    Zhu, Chuanhu
    Zhang, Song
    Li, Qiang
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 468 - 481
  • [10] Collaborative Annotation for Person Identification in TV Shows
    Budnik, Matheuz
    Besacier, Laurent
    Poignant, Johann
    Bredin, Herve
    Barras, Claude
    Stefas, Mickael
    Bruneau, Pierrick
    Tamisier, Thomas
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2607 - 2608