DAKRS: Domain Adaptive Knowledge-Based Retrieval System for Natural Language-Based Vehicle Retrieval

Cited by: 0
Authors
Ha, Synh Viet-Uyen [1]
Le, Huy Dinh-Anh
Nguyen, Quang Qui-Vinh
Chung, Nhat Minh
Affiliations
[1] Ho Chi Minh City Int Univ VNU HCMIU, Vietnam Natl Univ, Ho Chi Minh City 700000, Vietnam
Keywords
Contrastive representation learning; text-to-image retrieval; vehicle retrieval; semi-supervised learning; domain adaptation; background subtraction
DOI
10.1109/ACCESS.2023.3260149
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Given natural language (NL) text descriptions, NL-based vehicle retrieval aims to retrieve target vehicles from a multi-view, multi-camera traffic video pool. Solutions to the problem are challenged not only by inherent distinctions between the textual and visual domains, but also by the high dimensionality of visual data, the diverse range of textual descriptions, a major lack of high-volume datasets in this relatively new field, and large domain gaps between training and test sets. To deal with these issues, existing approaches have advocated computationally expensive models that separately extract the language and vision subspaces before blending them into a shared representation space. Through our proposed Domain Adaptive Knowledge-based Retrieval System (DAKRS), we show that by taking advantage of multi-modal information in a pretrained model, we can better focus on training robust representations in the shared space with limited labels, rather than on robust extraction of uni-modal representations, which comes with increased computational burdens. Our contributions are threefold: (i) an efficient extension of Contrastive Language-Image Pre-training (CLIP)'s transfer learning into a baseline text-to-image multi-modular vehicle retrieval framework; (ii) a data enhancement method that creates pseudo-vehicle tracks from the traffic video pool by leveraging the robustness of the baseline retrieval model combined with background subtraction; and (iii) a Semi-Supervised Domain Adaptation (SSDA) scheme that engineers pseudo-labels for adapting model parameters to the target domain. Benchmarked on CityFlow-NL, DAKRS obtains 63.20% MRR with 150.0M parameters, illustrating competitive effectiveness and efficiency against state-of-the-art methods, without ensembling.
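For readers unfamiliar with the reported metric, the following is a minimal sketch of how Mean Reciprocal Rank (MRR) is conventionally computed for a retrieval task: each query contributes the reciprocal of the rank at which its correct item appears, and the values are averaged over all queries. The function name, the track identifiers, and the toy rankings are hypothetical and are not taken from the paper or from the benchmark's evaluation code.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct item per query (0 if it is absent).

    rankings[i] is the ranked candidate list returned for query i
    (best match first); targets[i] is the ground-truth item for query i.
    """
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")

    total = 0.0
    for ranked, target in zip(rankings, targets):
        # Ranks are 1-based: the top candidate has rank 1.
        for position, candidate in enumerate(ranked, start=1):
            if candidate == target:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical toy example: three text queries over three vehicle tracks.
toy_rankings = [
    ["track_3", "track_1", "track_2"],  # correct track at rank 2
    ["track_2", "track_1", "track_3"],  # correct track at rank 1
    ["track_1", "track_3", "track_2"],  # correct track at rank 3
]
toy_targets = ["track_1", "track_2", "track_2"]
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_targets)
```

In this toy case the reciprocal ranks are 1/2, 1, and 1/3, so the MRR is their mean, 11/18. A reported MRR of 63.20% corresponds to a value of 0.6320 on this scale.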
Pages: 90951-90965
Page count: 15