Person Search with Natural Language Description

Cited by: 213
Authors
Li, Shuang [1]
Xiao, Tong [1]
Li, Hongsheng [1]
Zhou, Bolei [2]
Yue, Dayu [3]
Wang, Xiaogang [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[2] MIT, Cambridge, MA 02139 USA
[3] SenseTime Grp Ltd, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
DOI
10.1109/CVPR.2017.551
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Searching for persons in large-scale image databases using natural language descriptions as queries has important applications in video surveillance. Existing methods mainly focus on searching for persons with image-based or attribute-based queries, which have major limitations in practical use. In this paper, we study the problem of person search with natural language description. Given the textual description of a person, the person search algorithm is required to rank all the samples in the person database and then retrieve the most relevant sample corresponding to the queried description. Since no person dataset or benchmark with textual descriptions is available, we collect a large-scale person description dataset with detailed natural language annotations and person samples from various sources, termed the CUHK Person Description Dataset (CUHK-PEDES). A wide range of possible models and baselines have been evaluated and compared on the person search benchmark. A Recurrent Neural Network with Gated Neural Attention mechanism (GNA-RNN) is proposed to establish state-of-the-art performance on person search.
Pages: 5187 - 5196
Page count: 10
Related papers
50 in total
  • [21] Natural Language Search of Sensor Data
    Zhang, Keyi
    Marchiori, Alan
    2016 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATION WORKSHOPS (PERCOM WORKSHOPS), 2016,
  • [22] The Role of the Input in Natural Language Video Description
    Cascianelli, Silvia
    Costante, Gabriele
    Devo, Alessandro
    Ciarfuglia, Thomas A.
    Valigi, Paolo
    Fravolini, Mario L.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (01) : 271 - 283
  • [23] Attention-based Natural Language Person Retrieval
    Zhou, Tao
    Chen, Muhao
    Yu, Jie
    Terzopoulos, Demetri
    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 27 - 34
  • [24] Automatic description of static images in natural language
    Rendón, AM
    Luna, PS
    Salgado, GR
    Serna, JGG
    Covarrubias, RF
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS, 2005, 3513 : 398 - 401
  • [25] The Applications of Description Logics in Natural Language Processing
    Cheng Xian-Yi
    Cheng Chen
    Zhu Qian
    ADVANCED RESEARCH ON INDUSTRY, INFORMATION SYSTEMS AND MATERIAL ENGINEERING, PTS 1-7, 2011, 204-210 : 381 - +
  • [26] Natural Language Description of Videos for Smart Surveillance
    Dilawari, Aniqa
    Khan, Muhammad Usman Ghani
    Al-Otaibi, Yasser D.
    Rehman, Zahoor-ur
    Rahman, Atta-ur
    Nam, Yunyoung
    APPLIED SCIENCES-BASEL, 2021, 11 (09):
  • [27] The Applications of Description Logics in Natural Language Processing
    Cheng Xian-Yi
    Cheng Chen
    Zhu Qian
    ADVANCED MATERIALS SCIENCE AND TECHNOLOGY, PTS 1-2, 2011, 181-182 : 236 - +
  • [28] Natural language agreement description for reversible grammars
    Diaconescu, S
    AI 2003: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2003, 2903 : 161 - 172
  • [29] Refined Knowledge Transfer for Language-Based Person Search
    Wu, Ziqiang
    Ma, Bingpeng
    Chang, Hong
    Shan, Shiguang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9315 - 9329
  • [30] Hybrid Attention Network for Language-Based Person Search
    Li, Yang
    Xu, Huahu
    Xiao, Junsheng
    SENSORS, 2020, 20 (18) : 1 - 23