Attention-based Natural Language Person Retrieval

Cited by: 8
Authors
Zhou, Tao [1]
Chen, Muhao [1]
Yu, Jie [2]
Terzopoulos, Demetri [1]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA 90095 USA
[2] SAIC Innovat Ctr, San Jose, CA USA
DOI
10.1109/CVPRW.2017.10
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Following the recent progress in image classification and captioning using deep learning, we develop a novel natural language person retrieval system based on an attention mechanism. More specifically, given the description of a person, the goal is to localize the person in an image. To this end, we first construct a benchmark dataset for natural language person retrieval: we generate bounding boxes for persons in a public image dataset from the segmentation masks, and these are then annotated with descriptions and attributes using Amazon Mechanical Turk. We then adopt the region proposal network of Faster R-CNN as a candidate region generator. The images cropped according to the region proposals, as well as the whole images with attention weights, are fed into Convolutional Neural Networks for visual feature extraction, while the natural language expression and attributes are input to Bidirectional Long Short-Term Memory (BLSTM) models for text feature extraction. The visual and text features are integrated to score the region proposals, and the proposal with the highest score is retrieved as the output of our system. The experimental results show significant improvement over the state-of-the-art method for generic object retrieval, and this line of research promises to benefit search in surveillance video footage.
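Two steps named in the abstract, deriving a bounding box from a segmentation mask and returning the highest-scoring region proposal, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the feature vectors are placeholders for the CNN and BLSTM outputs, and cosine similarity stands in for the scoring function, which the abstract does not specify.

```python
import math
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates (assumed convention)
Box = Tuple[int, int, int, int]


def mask_to_bbox(mask: Sequence[Sequence[int]]) -> Box:
    """Return the tightest box around the nonzero pixels of a binary mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    return (min(cols), min(rows), max(cols), max(rows))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; a stand-in for the paper's learned scoring."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(
    proposals: List[Box],
    visual_feats: List[Sequence[float]],
    text_feat: Sequence[float],
) -> Tuple[Box, float]:
    """Score every proposal against the text feature and return the best one."""
    if not proposals or len(proposals) != len(visual_feats):
        raise ValueError("need one visual feature per proposal")
    scores = [cosine(feat, text_feat) for feat in visual_feats]
    best = max(range(len(scores)), key=scores.__getitem__)
    return proposals[best], scores[best]
```

For example, `retrieve([(0, 0, 10, 10), (5, 5, 20, 20)], [[1.0, 0.0], [0.0, 1.0]], [0.1, 0.9])` selects the second box, because its placeholder visual feature is better aligned with the text feature.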
Pages: 27-34
Page count: 8