Face database generation based on text-video correlation

Cited by: 2
Authors
Zeng, Dan [1]
Bao, Yixin [1]
Liu, Ke [1]
Zhao, Fan [1]
Tian, Qi [2]
Affiliations
[1] Shanghai Univ, Key Lab Specialty Fiber Opt & Opt Access Netw, Shanghai, Peoples R China
[2] Univ Texas San Antonio, San Antonio, TX USA
Funding
National Natural Science Foundation of China
Keywords
Face database generation; DNN; Text-video correlation; ALGORITHM; IDENTIFICATION
DOI
10.1016/j.neucom.2016.05.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Database size is key to the success of face recognition systems. However, building such a database is both time-consuming and labor-intensive. In this paper, we address the problem by proposing a database generation framework based on text-video correlation. Specifically, the visual content of a video can be represented as a character sequence through face detection, tracking, and recognition, while text information extracted from subtitles and scripts provides a complementary identity sequence. By correlating these two sequences, the recognized faces can be refined without manual intervention. Experiments demonstrate that the framework can reduce the human effort in face database construction by 90%. (C) 2016 Elsevier B.V. All rights reserved.
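The abstract does not say how the two sequences are correlated, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each face track has already been given a tentative identity label and that the subtitles and scripts yield an ordered list of character names. The function name `refine_labels` and the sample names are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever correlation step the paper actually uses.

```python
from difflib import SequenceMatcher


def refine_labels(visual_seq, text_seq):
    """Align two identity sequences and split the visual labels into
    those supported by the text and those left for manual review.

    Returns two lists of indices into visual_seq.
    """
    # autojunk=False disables difflib's popularity heuristic, which is
    # meant for long text and could discard frequent character names.
    matcher = SequenceMatcher(a=visual_seq, b=text_seq, autojunk=False)
    confirmed, uncertain = [], []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "equal" spans agree with the text; "replace" and "delete"
        # spans do not; "insert" spans cover no visual labels at all.
        target = confirmed if tag == "equal" else uncertain
        target.extend(range(i1, i2))
    return confirmed, uncertain


# Hypothetical labels: one per face track, in order of appearance.
visual = ["alice", "bob", "carol", "alice", "dave"]
# Hypothetical speaker order taken from subtitles or a script.
text = ["alice", "bob", "alice", "dave"]

confirmed, uncertain = refine_labels(visual, text)
print("confirmed tracks:", confirmed)
print("tracks needing review:", uncertain)
```

In this toy case only the track labelled "carol" has no counterpart in the text sequence, so it is the one a human annotator would need to check. The rest would be accepted automatically, which is the kind of saving in manual effort the abstract describes.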
Pages: 240-249
Page count: 10
Related papers
50 in total
  • [31] Text driven face-video synthesis using GMM and spatial correlation
    Teferi, Dereje
    Faraj, Maycel L.
    Bigun, Josef
    [J]. IMAGE ANALYSIS, PROCEEDINGS, 2007, 4522 : 572 - +
  • [32] Reading and watching news online: Orienting to a text-video modality switch and recognition memory as a function of text structure and video intensity
    Wise, Kevin
    Bolls, Paul D.
    Myers, Justin
    Heaton, Rachel
    Pellot, Brian
    [J]. PSYCHOPHYSIOLOGY, 2007, 44 : S89 - S89
  • [33] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
    Yang, Xiangpeng
    Zhu, Linchao
    Wang, Xiaohan
    Yang, Yi
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024, : 6540 - 6548
  • [34] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
    Liu, Yuqi
    Xiong, Pengfei
    Xu, Luhui
    Cao, Shengming
    Jin, Qin
    [J]. COMPUTER VISION - ECCV 2022, PT XIV, 2022, 13674 : 319 - 335
  • [35] A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval
    Falcon, Alex
    Serra, Giuseppe
    Lanz, Oswald
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 4385 - 4394
  • [36] X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
    Gorti, Satya Krishna
    Vouitsis, Noel
    Ma, Junwei
    Golestan, Keyvan
    Volkovs, Maksims
    Garg, Animesh
    Yu, Guangwei
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4996 - 5005
  • [37] Controllable Video Generation With Text-Based Instructions
    Koksal, Ali
    Ak, Kenan E.
    Sun, Ying
    Rajan, Deepu
    Lim, Joo Hwee
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 190 - 201
  • [38] MGSGA: Multi-grained and Semantic-Guided Alignment for Text-Video Retrieval
    Wu, Xiaoyu
    Qian, Jiayao
    Yang, Lulu
    [J]. NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [39] Video Face Swap Based on Autoencoder Generation Network
    Yan, Shuqi
    He, Shaorong
    Lei, Xue
    Ye, Guanhua
    Xie, Zhifeng
    [J]. 2018 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2018, : 103 - 108
  • [40] A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database
    Huang, Zhiwu
    Shan, Shiguang
    Wang, Ruiping
    Zhang, Haihong
    Lao, Shihong
    Kuerban, Alifu
    Chen, Xilin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (12) : 5967 - 5981