Multi-modal multi-task feature fusion for RGBT tracking

Cited by: 17
Authors
Cai, Yujue [1]
Sui, Xiubao [1]
Gu, Guohua [1]
Affiliations
[1] Nanjing University of Science and Technology, School of Electronic and Optical Engineering, Nanjing 210014, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
RGBT tracking; Auxiliary learning; Contrastive learning; Semantic matching; Instance segmentation; Network
DOI
10.1016/j.inffus.2023.101816
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGBT tracking has received increasing attention in recent years. In this paper, we propose a multi-task auxiliary learning framework for RGBT tracking. Specifically, we simplify the tracking task to an instance classification task and make it the primary task of the framework. We design three auxiliary tasks and train all tasks jointly with hard parameter sharing so that the primary task can benefit from them. The three auxiliary tasks are contrastive instance discrimination, one-shot instance segmentation, and instance semantic matching. Contrastive instance discrimination aids the classification in the primary task by constraining the features in the representation space. One-shot instance segmentation trains the network in a weakly supervised way to focus on more fine-grained features. In addition, to make the network attend to the invariant features of the target instance during tracking, we introduce a semantic matching task that alleviates the model drift caused by changes over time. On three RGBT tracking benchmarks, the proposed framework performs on par with state-of-the-art trackers.
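The hard-parameter-sharing setup described in the abstract can be illustrated with a minimal sketch: one shared trunk feeds a primary classification head and three auxiliary heads, and the training loss is the primary loss plus a weighted sum of auxiliary losses. Everything below is an assumption made for illustration and is not taken from the paper: the class and function names, the layer sizes, the loss weights, and in particular the use of plain cross-entropy as a stand-in for each auxiliary loss (the actual contrastive, segmentation, and matching losses are not specified in the abstract).

```python
import math
import random


def linear(weights, x):
    """Apply a dense layer: one output per weight row (no bias, for brevity)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def init(rows, cols, rng):
    """Random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class HardSharedModel:
    """Hypothetical model: one shared trunk, one small head per task."""

    PRIMARY = "classification"
    AUXILIARY = ("contrastive", "segmentation", "matching")

    def __init__(self, in_dim=8, hidden=16, out_dim=2, seed=0):
        rng = random.Random(seed)
        # The trunk parameters are shared by every task (hard sharing).
        self.trunk = init(hidden, in_dim, rng)
        tasks = (self.PRIMARY,) + self.AUXILIARY
        self.heads = {t: init(out_dim, hidden, rng) for t in tasks}

    def forward(self, x):
        shared = relu(linear(self.trunk, x))  # computed once, reused by all heads
        return {t: linear(w, shared) for t, w in self.heads.items()}


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def total_loss(outputs, labels, aux_weights):
    """Primary loss plus weighted auxiliary losses.

    Cross-entropy is only a placeholder for the auxiliary losses here.
    """
    loss = cross_entropy(outputs[HardSharedModel.PRIMARY],
                         labels[HardSharedModel.PRIMARY])
    for task, weight in aux_weights.items():
        loss += weight * cross_entropy(outputs[task], labels[task])
    return loss


model = HardSharedModel()
outputs = model.forward([0.1] * 8)
labels = {task: 0 for task in outputs}
aux_weights = {task: 0.5 for task in HardSharedModel.AUXILIARY}  # assumed weights
joint = total_loss(outputs, labels, aux_weights)
```

Because gradients from every head would flow into the same trunk during training, the auxiliary tasks act as a regularizer on the shared features; at tracking time only the primary head would be needed.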
Pages: 17