When CNN meet with ViT: decision-level feature fusion for camouflaged object detection

Cited by: 0
Authors
Yue, Guowen [1]
Jiao, Ge [1, 2]
Li, Chen [1]
Xiang, Jiahao [1]
Affiliations
[1] Hengyang Normal Univ, Coll Comp Sci & Technol, Hengyang 421002, Peoples R China
[2] Hengyang Normal Univ, Hunan Prov Key Lab Intelligent Informat Proc & App, Hengyang 421002, Peoples R China
Source
Keywords
Convolutional neural network; Vision transformer; Camouflaged object detection; Feature fusion; Network
DOI
10.1007/s00371-024-03640-8
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Convolutional neural network (CNN) methods and vision transformer (ViT) methods have significantly advanced camouflaged object detection, but both have limitations. CNN-based methods fail to capture long-range dependencies because of their limited receptive fields, while ViT-based methods lose detailed information because of large-span aggregation. To address these issues, we introduce a novel model, the double-extraction and triple-fusion network (DTNet), which combines the global context modeling of ViT-based encoders with the detail capture of CNN-based encoders through decision-level feature fusion, so that each compensates for the other's shortcomings and camouflaged objects are segmented more completely. Specifically, a boundary guidance module aggregates high-level and low-level boundary information through multi-scale feature decoding, thereby guiding the local detail representation of the transformer. A global context aggregation module shrinks the information of adjacent channels from top to bottom and aggregates information of high-level and low-level scales from bottom to top for feature decoding. A multi-feature fusion module fuses global context features and local detail features, employing an attention mechanism in different channels to assign varying weights to long-range and short-range information. Extensive experiments show that DTNet significantly outperforms 20 recent state-of-the-art methods. The related code and datasets will be posted at https://github.com/KungFuProgrammerle/DTNet.
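
The channel-wise weighting described for the multi-feature fusion module can be illustrated with a small sketch. This is not the DTNet implementation: the function names (`channel_means`, `fuse_channels`), the flattened list-of-channels layout, and the parameter-free softmax over per-channel average-pooled descriptors are all assumptions made for illustration. A real module would most likely learn its attention weights rather than derive them directly from channel means.

```python
import math
from typing import List

# Hypothetical layout: one list per channel, each holding the flattened
# spatial activations of that channel.
Feature = List[List[float]]


def channel_means(feat: Feature) -> List[float]:
    """Global average pooling: one scalar descriptor per channel."""
    return [sum(channel) / len(channel) for channel in feat]


def fuse_channels(global_feat: Feature, local_feat: Feature) -> Feature:
    """Blend a global-context branch and a local-detail branch per channel.

    Assumption: the attention weights come from a softmax over the two
    branches' channel descriptors, with no learned parameters.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("both branches must have the same number of channels")

    fused: Feature = []
    descriptors = zip(channel_means(global_feat), channel_means(local_feat))
    for g_chan, l_chan, (g_mean, l_mean) in zip(global_feat, local_feat, descriptors):
        if len(g_chan) != len(l_chan):
            raise ValueError("matching channels must have the same spatial size")
        # Numerically stable two-way softmax over the branch descriptors.
        shift = max(g_mean, l_mean)
        g_exp = math.exp(g_mean - shift)
        l_exp = math.exp(l_mean - shift)
        w_global = g_exp / (g_exp + l_exp)
        w_local = 1.0 - w_global
        # Each fused value is a convex combination of the two branches.
        fused.append([w_global * g + w_local * l for g, l in zip(g_chan, l_chan)])
    return fused
```

Because the two weights of a channel sum to one, every fused activation lies between the corresponding global and local activations, and different channels can favor different branches.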
Pages: 16