Towards Feasible Capsule Network for Vision Tasks

Cited by: 0
Authors
Vu, Dang Thanh [1]
An, Le Bao Thai [1]
Kim, Jin Young [1]
Yu, Gwang Hyun [1]
Ferrari, Gianluigi
Affiliations
[1] Chonnam Natl Univ, Dept ICT Convergence Syst Engn, 77 Yongbong Ro, Gwangju 61186, South Korea
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 18
Keywords
capsule network; computer vision; equivariance; segmentation; pre-trained model;
DOI
10.3390/app131810339
Chinese Library Classification (CLC) number
O6 [Chemistry];
Subject classification code
0703;
Abstract
Capsule networks exhibit the potential to enhance computer vision tasks through their utilization of equivariance for capturing spatial relationships. However, the broader adoption of these networks has been impeded by the computational complexity of their routing mechanism and shallow backbone model. To address these challenges, this paper introduces an innovative hybrid architecture that seamlessly integrates a pretrained backbone model with a task-specific capsule head (CapsHead). Our methodology is extensively evaluated across a range of classification and segmentation tasks, encompassing diverse datasets. The empirical findings robustly underscore the efficacy and practical feasibility of our proposed approach in real-world vision applications. Notably, our approach yields substantial 3.45% and 6.24% enhancement in linear evaluation on the CIFAR10 dataset and segmentation on the VOC2012 dataset, respectively, compared to baselines that do not incorporate the capsule head. This research offers a noteworthy contribution by not only advancing the application of capsule networks, but also mitigating their computational complexities. The results substantiate the feasibility of our hybrid architecture, thereby paving the way for a wider integration of capsule networks into various computer vision tasks.
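The hybrid design described in the abstract (a pretrained backbone feeding a task-specific capsule head) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: `frozen_backbone` is a hypothetical stand-in (a fixed random projection with ReLU) for a real pretrained model, `CapsHeadSketch` is a hypothetical routing-free head, and the feature width, capsule dimension, and weight scales are arbitrary. Only the squashing nonlinearity is the standard one from the original capsule network literature.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Standard capsule squashing: keeps direction, maps vector length into [0, 1)."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def frozen_backbone(images, proj):
    """Hypothetical stand-in for a pretrained backbone: fixed projection + ReLU."""
    flat = images.reshape(images.shape[0], -1)
    return np.maximum(flat @ proj, 0.0)


class CapsHeadSketch:
    """Hypothetical capsule head: one capsule per class, no iterative routing."""

    def __init__(self, feat_dim, num_classes, caps_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.num_classes = num_classes
        self.caps_dim = caps_dim
        # Assumed: a single linear map from backbone features to all class capsules.
        self.W = rng.normal(0.0, 0.1, size=(feat_dim, num_classes * caps_dim))

    def __call__(self, features):
        # features: (batch, feat_dim) produced by the frozen backbone
        s = (features @ self.W).reshape(-1, self.num_classes, self.caps_dim)
        v = squash(s)
        # The length of each class capsule serves as the class score.
        return np.linalg.norm(v, axis=-1)


rng = np.random.default_rng(1)
images = rng.random((4, 3, 32, 32))                      # CIFAR10-sized dummy batch
proj = rng.normal(0.0, 0.02, size=(3 * 32 * 32, 64))     # frozen, never updated
features = frozen_backbone(images, proj)
head = CapsHeadSketch(feat_dim=64, num_classes=10, caps_dim=8)
scores = head(features)
print(scores.shape)  # (4, 10)
```

In a linear-evaluation setting such as the one the abstract mentions for CIFAR10, only the head's weights would be trained while the backbone stays fixed; the sketch omits training entirely.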
Pages: 15
Related papers
50 in total
  • [11] Surrogate Contrastive Network for Supervised Band Selection in Multispectral Computer Vision Tasks
    Bernal, Edgar A.
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019), 2019, : 998 - 1006
  • [12] Multimodal high-order relational network for vision-and-language tasks
    Pan, Hao
    Huang, Jun
    NEUROCOMPUTING, 2022, 492 : 62 - 75
  • [13] Research progress of computer vision tasks based on deep learning and SAE network
    Ling, Shijia
    Yi, Qiaoling
    Lan, Banru
    Liu, Liangfang
    APPLIED MATHEMATICS AND NONLINEAR SCIENCES, 2023, 8 (02) : 985 - 994
  • [14] Flow Algebra: Towards an Efficient, Unifying Framework for Network Management Tasks
    Leet, Christopher
    Soule, Robert
    Yang, Yang Richard
    Zhang, Ying
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2021), 2021,
  • [15] Towards Executing Computer Vision Functionality on Programmable Network Devices
    Glebke, Rene
    Krude, Johannes
    Kunze, Ike
    Ruth, Jan
    Senger, Felix
    Wehrle, Klaus
    PROCEEDINGS OF THE 1ST ACM CONEXT WORKSHOP ON EMERGING IN-NETWORK COMPUTING PARADIGMS (ENCP '19), 2019, : 15 - 20
  • [16] Towards Multi-Interest Pre-training with Sparse Capsule Network
    Tang, Zuoli
    Wang, Lin
    Zou, Lixin
    Zhang, Xiaolu
    Zhou, Jun
    Li, Chenliang
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 311 - 320
  • [17] Towards the characterization of representations learned via capsule-based network architectures
    Tawalbeh, Saja
    Oramas, Jose
    NEUROCOMPUTING, 2025, 617
  • [18] Performing tasks with peripheral vision
    Woods, Russell L.
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2018, 59 (09)
  • [19] Towards Providing Low-Risk and Economically Feasible Network Data Transfer Services
    Andreica, Mugurel Ionut
    Deac, Vasile
    Tipa, Stelian
    PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE ON SIGNALS, SPEECH AND IMAGE PROCESSING/9TH WSEAS INTERNATIONAL CONFERENCE ON MULTIMEDIA, INTERNET & VIDEO TECHNOLOGIES, 2009, : 204 - +
  • [20] Competitiveness - 2015, vision and tasks
    Vertes, Andras
    Viszt, Erzsebet
    PUBLIC FINANCE QUARTERLY-HUNGARY, 2007, 52 (3-4): : 488 - 513