EISNet: A Multi-Modal Fusion Network for Semantic Segmentation With Events and Images

被引:1
|
作者
Xie, Bochen [1 ]
Deng, Yongjian [2 ]
Shao, Zhanpeng [3 ]
Li, Youfu [1 ]
机构
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
[2] Beijing Univ Technol, Coll Comp Sci, Beijing 100124, Peoples R China
[3] Hunan Normal Univ, Coll Informat Sci & Engn, Changsha 410081, Peoples R China
基金
中国国家自然科学基金;
关键词
Semantic segmentation; Cameras; Standards; Visualization; Task analysis; Semantics; Noise measurement; Event camera; multi-modal fusion; attention mechanism; semantic segmentation; VISION;
D O I
10.1109/TMM.2024.3380255
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Bio-inspired event cameras record a scene as sparse and asynchronous "events" by detecting per-pixel brightness changes. Such cameras show great potential in challenging scene understanding tasks, benefiting from the imaging advantages of high dynamic range and high temporal resolution. Considering the complementarity between event and standard cameras, we propose a multi-modal fusion network (EISNet) to improve the semantic segmentation performance. The key challenges of this topic lie in (i) how to encode event data to represent accurate scene information and (ii) how to fuse multi-modal complementary features by considering the characteristics of two modalities. To solve the first challenge, we propose an Activity-Aware Event Integration Module (AEIM) to convert event data into frame-based representations with high-confidence details via scene activity modeling. To tackle the second challenge, we introduce the Modality Recalibration and Fusion Module (MRFM) to recalibrate modal-specific representations and then aggregate multi-modal features at multiple stages. MRFM learns to generate modal-oriented masks to guide the merging of complementary features, achieving adaptive fusion. Based on these two core designs, our proposed EISNet adopts an encoder-decoder transformer architecture for accurate semantic segmentation using events and images. Experimental results show that our model outperforms state-of-the-art methods by a large margin on event-based semantic segmentation datasets.
引用
收藏
页码:8639 / 8650
页数:12
相关论文
共 50 条
  • [31] MMNet: Multi-modal multi-stage network for RGB-T image semantic segmentation
    Lan, Xin
    Gu, Xiaojing
    Gu, Xingsheng
    APPLIED INTELLIGENCE, 2022, 52 (05) : 5817 - 5829
  • [32] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    PATTERN RECOGNITION, 2023, 137
  • [33] Multi-modal Prototypes for Open-World Semantic Segmentation
    Yang, Yuhuan
    Ma, Chaofan
    Ju, Chen
    Zhang, Fei
    Yao, Jiangchao
    Zhang, Ya
    Wang, Yanfeng
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (12) : 6004 - 6020
  • [34] An automatic fusion algorithm for multi-modal medical images
    Aktar, Mst. Nargis
    Lambert, Andrew J.
    Pickering, Mark
    COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION, 2018, 6 (05): : 584 - 598
  • [35] Ticino: A multi-modal remote sensing dataset for semantic segmentation
    Barbato, Mirko Paolo
    Piccoli, Flavio
    Napoletano, Paolo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [36] Recognition of multi-modal fusion images with irregular interference
    Wang, Yawei
    Chen, Yifei
    Wang, Dongfeng
    PeerJ Computer Science, 2022, 8
  • [37] A2FSeg: Adaptive Multi-modal Fusion Network for Medical Image Segmentation
    Wang, Zirui
    Hong, Yi
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 673 - 681
  • [38] MidFusNet: Mid-dense Fusion Network for Multi-modal Brain MRI Segmentation
    Duan, Wenting
    Zhang, Lei
    Colman, Jordan
    Gulli, Giosue
    Ye, Xujiong
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES, BRAINLES 2022, 2023, 13769 : 102 - 114
  • [39] Self-Supervised Multi-Modal Hybrid Fusion Network for Brain Tumor Segmentation
    Fang, Feiyi
    Yao, Yazhou
    Zhou, Tao
    Xie, Guosen
    Lu, Jianfeng
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (11) : 5310 - 5320
  • [40] Dual-Attention Deep Fusion Network for Multi-modal Medical Image Segmentation
    Zheng, Shenhai
    Ye, Xin
    Tan, Jiaxin
    Yang, Yifei
    Li, Laquan
    FOURTEENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING, ICGIP 2022, 2022, 12705