Person Search with Natural Language Description

Cited by: 213
Authors
Li, Shuang [1]
Xiao, Tong [1]
Li, Hongsheng [1]
Zhou, Bolei [2]
Yue, Dayu [3]
Wang, Xiaogang [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[2] MIT, Cambridge, MA 02139 USA
[3] SenseTime Grp Ltd, Hong Kong, Hong Kong, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
DOI
10.1109/CVPR.2017.551
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Searching for persons in large-scale image databases using natural language descriptions as queries has important applications in video surveillance. Existing methods mainly focus on searching for persons with image-based or attribute-based queries, which have major limitations in practical use. In this paper, we study the problem of person search with natural language description. Given the textual description of a person, the person search algorithm is required to rank all the samples in the person database and then retrieve the most relevant sample corresponding to the queried description. Since no person dataset or benchmark with textual descriptions is available, we collect a large-scale person description dataset with detailed natural language annotations and person samples from various sources, termed the CUHK Person Description Dataset (CUHK-PEDES). A wide range of possible models and baselines are evaluated and compared on the person search benchmark. A Recurrent Neural Network with Gated Neural Attention mechanism (GNA-RNN) is proposed to establish state-of-the-art performance on person search.
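The retrieval task stated in the abstract (rank every gallery sample against one textual query, then return the best matches) can be illustrated with a minimal sketch. This is not the GNA-RNN model from the paper: it assumes the query and each person image have already been mapped to fixed-length vectors by some unspecified encoder, and it substitutes plain cosine similarity for the learned affinity. The function names `rank_gallery` and `top_k_hit` and the toy vectors are hypothetical.

```python
import numpy as np


def rank_gallery(query_vec, gallery_vecs):
    """Rank gallery samples by cosine similarity to the query (assumed scoring rule).

    Returns (order, scores): order lists gallery indices from most to least
    similar; scores[i] is the similarity of gallery sample i.
    """
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    scores = g @ q
    # Negate so that argsort (ascending) yields descending similarity.
    order = np.argsort(-scores, kind="stable")
    return order, scores


def top_k_hit(order, relevant_ids, k):
    """True if any relevant gallery index appears among the first k ranked results."""
    return any(int(i) in relevant_ids for i in order[:k])


# Toy data: three gallery embeddings and one query embedding (all made up).
gallery = np.array([[1.0, 0.0],
                    [0.6, 0.8],
                    [0.0, 1.0]])
query = np.array([0.0, 2.0])

order, scores = rank_gallery(query, gallery)
print(order.tolist())                 # gallery indices, best match first
print(top_k_hit(order, {1}, k=2))     # is sample 1 within the top 2?
```

Averaging `top_k_hit` over many queries gives a top-k accuracy figure, a common way to score retrieval benchmarks of this kind.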
Pages: 5187 - 5196
Number of pages: 10
Related papers
50 in total
  • [1] Multimodal Alignment and Attention-Based Person Search via Natural Language Description
    Ji, Zhong
    Li, Shengjia
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2020, 7 (11) : 11147 - 11156
  • [2] Interactive Natural Language-Based Person Search
    Shree, Vikram
    Chao, Wei-Lun
    Campbell, Mark
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (02) : 1851 - 1858
  • [3] Improving Natural Language Person Description Search from Videos with Language Model Fine-Tuning and Approximate Nearest Neighbor
    Yuenyong, Sumeth
    Wongpatikaseree, Konlakorn
    [J]. BIG DATA AND COGNITIVE COMPUTING, 2022, 6 (04)
  • [4] Adversarial Attribute-Text Embedding for Person Search With Natural Language Query
    Zha, Zheng-Jun
    Liu, Jiawei
    Chen, Di
    Wu, Feng
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (07) : 1836 - 1846
  • [5] Natural Language Person Retrieval
    Zhou, Tao
    Yu, Jie
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017 : 5023 - 5024
  • [6] Fusion-Attention Network for person search with free-form natural language
    Ji, Zhong
    Li, Shengjia
    Pang, Yanwei
    [J]. PATTERN RECOGNITION LETTERS, 2018, 116 : 205 - 211
  • [7] Towards a large-scale person search by vietnamese natural language: dataset and methods
    Thi Thanh Thuy Pham
    Hong-Quan Nguyen
    Hoai Phan
    Thi-Ngoc-Diep Do
    Thuy-Binh Nguyen
    Thanh-Hai Tran
    Thi-Lan Le
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (19) : 27569 - 27600
  • [9] Person Tube Retrieval via Language Description
    Fan, Hehe
    Yang, Yi
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 10754 - 10761
  • [10] ALGORITHMIC DESCRIPTION OF NATURAL LANGUAGE
    LONGUETH.HC
    [J]. PROCEEDINGS OF THE ROYAL SOCIETY SERIES B-BIOLOGICAL SCIENCES, 1972, 182 (1068) : 255 - +