Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

Cited by: 10
Authors
Bai, Shuai [1]
Zheng, Zhedong [2]
Wang, Xiaohan [3]
Lin, Junyang [1]
Zhang, Zhu [1]
Zhou, Chang [1]
Yang, Hongxia [1]
Yang, Yi [2]
Affiliations
[1] Alibaba Grp, DAMO Acad, Hangzhou, Peoples R China
[2] Univ Technol Sydney, ReLER Lab, Sydney, NSW, Australia
[3] Zhejiang Univ, Hangzhou, Peoples R China
DOI
10.1109/CVPRW53098.2021.00455
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vehicle search is a basic task for efficient traffic management in the AI City setting. Most existing approaches focus on image-based vehicle matching, including vehicle re-identification and vehicle tracking. In this paper, we apply a new modality, the language description, to search for the vehicle of interest and explore the potential of this task in real-world scenarios. Natural language-based vehicle search poses a new challenge: fine-grained understanding of both the vision and language modalities. To connect language and vision, we propose to jointly train state-of-the-art vision models with a transformer-based language model in an end-to-end manner. Besides the network structure design and the training strategy, several optimization objectives are also revisited in this work. Qualitative and quantitative experiments verify the effectiveness of the proposed method. Our method achieved 1st place in the 5th AI City Challenge, with an MRR of 18.69% on the private test set. We hope this work can pave the way for future study on using language descriptions effectively and efficiently in real-world vehicle retrieval systems. The code will be available at https://github.com/ShuaiBai623/AIC2021-T5-CLV.
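The abstract reports its result as MRR (mean reciprocal rank). As a minimal sketch of how that metric is commonly computed for retrieval, assuming each text query has exactly one correct vehicle track, the following may help. The function name, the dictionary layout, and the toy IDs are illustrative assumptions, not taken from the paper or its code release.

```python
from typing import Dict, List


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, str],
) -> float:
    """Average of 1/rank of the correct item over all queries.

    rankings: query id -> candidate ids ordered from best to worst match
              (hypothetical layout, chosen for illustration).
    ground_truth: query id -> the single correct candidate id.
    A query whose correct item is missing from its ranking contributes 0.
    """
    if not rankings:
        raise ValueError("rankings must contain at least one query")

    total = 0.0
    for query_id, ranked_ids in rankings.items():
        target = ground_truth[query_id]
        if target in ranked_ids:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked_ids.index(target) + 1)
    return total / len(rankings)


# Toy example: the correct track "a" is ranked 1st, 2nd and 3rd.
toy_rankings = {
    "q1": ["a", "b", "c"],
    "q2": ["b", "a", "c"],
    "q3": ["c", "b", "a"],
}
toy_truth = {"q1": "a", "q2": "a", "q3": "a"}
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_truth)
```

In the toy example the reciprocal ranks are 1, 1/2 and 1/3, so the mean is 11/18. Under this definition, an MRR of 18.69% corresponds to a mean reciprocal rank of 0.1869 over the test queries.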
Pages: 4029-4038
Number of pages: 10