Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

Cited by: 0

Authors:

Bredin, Herve ^{[1]}

Poignant, Johann ^{[2]}

Tapaswi, Makarand ^{[3]}

Fortier, Guillaume ^{[4]}

Viet Bac Le ^{[5]}

Napoleon, Thibault ^{[6]}

Gao, Hua ^{[3]}

Barras, Claude ^{[1]}

Rosset, Sophie ^{[1]}

Besacier, Laurent ^{[2]}

Verbeek, Jakob ^{[4]}

Quenot, Georges ^{[2]}

Jurie, Frederic ^{[6]}

Ekenel, Hazim Kemal ^{[3]}

Affiliations:

[1] Univ Paris 11, CNRS, UPR 3251, LIMSI, BP 133, F-91403 Orsay, France

[2] UJF Grenoble 1, UPMF Grenoble 2, Grenoble INP, CNRS,UMR 5217,LIG, F-38041 Grenoble, France

[3] Karlsruher Inst Technol, Karlsruhe, Germany

[4] INRIA Rhone Alpes, F-38330 Montbonnot St Martin, France

[5] Vocapia Res, F-91400 Orsay, France

[6] Univ Caen, GREYC, UMR 6072, F-14050 Caen, France

Source:

COMPUTER VISION - ECCV 2012, PT III | 2012 / Vol. 7585

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Repere challenge is a project aiming at the evaluation of systems for supervised and unsupervised multimodal recognition of people in TV broadcast. In this paper, we describe, evaluate and discuss QCompere consortium submissions to the 2012 Repere evaluation campaign dry-run. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as supervised monomodal ones (with several hundreds of identity models).


Pages: 385 - 394

Number of pages: 10

50 items in total

[1] Improving Speaker Identification in TV-shows using person name detection in overlaid text and speech
Charlet, Delphine
Fredouille, Corinne
Damnati, Geraldine
Senay, Gregory
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2777 - 2781
[2] Fusion of speech and handwritten signatures biometrics for person identification
Abushariah A.A.M.
Abushariah M.A.M.
Gunawan T.S.
Chebil J.
Alqudah A.A.M.
Ting H.-N.
Mustafa M.B.P.
International Journal of Speech Technology, 2023, 26 (04) : 833 - 850
[3] Introducing FoxPersonTracks: a Benchmark for Person Re-Identification from TV Broadcast Shows
Auguste, Remi
Tirilly, Pierre
Martinet, Jean
2015 13TH INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING (CBMI), 2015,
[4] Person identification in TV programs
Li, DG
Wei, G
Sethi, IK
Dimitrova, N
JOURNAL OF ELECTRONIC IMAGING, 2001, 10 (04) : 930 - 938
[5] Overlay Text Extraction From TV News Broadcast
Kannao, Raghvendra
Guha, Prithwijit
2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
[6] ALIF: A Dataset for Arabic Embedded Text Recognition in TV Broadcast
Yousfi, Sonia
Berrani, Sid-Ahmed
Garcia, Christophe
2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 1221 - 1225
[7] Feature Level Fusion of Speech and Face Image based Person Identification System
Sugiarta, Gunawan Y. B.
Trilaksono, Bambang Riyanto
Hendrawan
Suhardi
2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS: ICCEA 2010, PROCEEDINGS, VOL 2, 2010, : 221 - 225
[8] Simultaneous Synchronization of Text and Speech for Broadcast News Subtitling
Gao, Jie
Zhao, Qingwei
Li, Ta
Yan, Yonghong
ADVANCES IN NEURAL NETWORKS - ISNN 2009, PT 3, PROCEEDINGS, 2009, 5553 : 576 - 585
[9] Multimodal Feature Hierarchical Fusion for Text-Image Person Re-identification
Li, Jiaxuan
Huang, Likun
Zhu, Chuanhu
Zhang, Song
Li, Qiang
PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 468 - 481
[10] Collaborative Annotation for Person Identification in TV Shows
Budnik, Matheuz
Besacier, Laurent
Poignant, Johann
Bredin, Herve
Barras, Claude
Stefas, Mickael
Bruneau, Pierrick
Tamisier, Thomas
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2607 - 2608
