Towards Feasible Capsule Network for Vision Tasks

Cited by: 0
Authors
Vu, Dang Thanh [1]
An, Le Bao Thai [1]
Kim, Jin Young [1]
Yu, Gwang Hyun [1]
Ferrari, Gianluigi
Affiliations
[1] Chonnam Natl Univ, Dept ICT Convergence Syst Engn, 77 Yongbong Ro, Gwangju 61186, South Korea
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 18
Keywords
capsule network; computer vision; equivariance; segmentation; pre-trained model
DOI
10.3390/app131810339
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Capsule networks exhibit the potential to enhance computer vision tasks through their utilization of equivariance for capturing spatial relationships. However, the broader adoption of these networks has been impeded by the computational complexity of their routing mechanism and their shallow backbone models. To address these challenges, this paper introduces a hybrid architecture that integrates a pretrained backbone model with a task-specific capsule head (CapsHead). Our methodology is evaluated across a range of classification and segmentation tasks on diverse datasets. The empirical findings underscore the efficacy and practical feasibility of the proposed approach in real-world vision applications. Notably, our approach yields improvements of 3.45% in linear evaluation on the CIFAR10 dataset and 6.24% in segmentation on the VOC2012 dataset, compared to baselines that do not incorporate the capsule head. This research contributes by not only advancing the application of capsule networks, but also mitigating their computational complexity. The results substantiate the feasibility of our hybrid architecture, paving the way for a wider integration of capsule networks into various computer vision tasks.
Pages: 15