IntelPVT: intelligent patch-based pyramid vision transformers for object detection and classification

Cited by: 1
Authors
Nimma, Divya [1]
Zhou, Zhaoxian [1]
Affiliations
[1] Univ Southern Mississippi, Sch Comp Sci & Comp Engn, 118 Coll Dr, Hattiesburg, MS 39406 USA
Keywords
Vision transformer; Object detection; Object classification; Pyramid vision transformer; Adaptive patch; Intelligent method
DOI
10.1007/s13042-023-01996-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Since the advent of Transformers, followed by Vision Transformers (ViTs), researchers have achieved enormous success in computer vision and object detection. However, the mechanism of splitting images into fixed-size patches poses a serious challenge, because it causes loss of useful information during object detection and classification. To overcome this challenge, we propose an innovative intelligent patching mechanism and integrate it seamlessly into the conventional patch-based ViT framework. The proposed method enables the use of patches with flexible sizes to capture and retain essential semantic content from input images, and thereby improves performance over conventional methods. Our method was evaluated on object detection and classification with three well-known datasets: Microsoft Common Objects in Context (MSCOCO-2017), Pascal VOC (Visual Object Classes Challenge), and Cityscapes. The experimental results showed promising improvements in specific metrics, particularly at higher confidence thresholds, making it a notable performer in object detection and classification tasks.
Pages: 1767-1778
Page count: 12
Related papers
50 in total
  • [1] IntelPVT: intelligent patch-based pyramid vision transformers for object detection and classification
    Divya Nimma
    Zhaoxian Zhou
    [J]. International Journal of Machine Learning and Cybernetics, 2024, 15 : 1767 - 1778
  • [2] IntelPVT: intelligent patch-based pyramid vision transformers for object detection and classification (Oct, 10.1007/s13042-023-01996-2, 2023)
    Nimma, Divya
    Zhou, Zhaoxian
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 3057 - 3057
  • [3] Experiments with patch-based object classification
    Wijnhoven, R. G. J.
    de With, P. H. N.
    [J]. 2007 IEEE CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE, 2007, : 105 - +
  • [4] Patch-Based Auxiliary Node Classification for Domain Adaptive Object Detection
    Qiu, Yuanyuan
    Xu, Zhijie
    Zhang, Jianqin
    [J]. ELECTRONICS, 2024, 13 (07)
  • [5] Patch-based Within-Object Classification
    Aghajanian, Jania
    Warrell, Jonathan
    Prince, Simon J. D.
    Li, Peng
    Rohn, Jennifer L.
    Baum, Buzz
    [J]. 2009 IEEE 12TH INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2009, : 1125 - 1132
  • [6] Patch-based experiments with object classification in video surveillance
    Wijnhoven, Rob
    de With, Peter H. N.
    [J]. ADVANCED CONCEPTS FOR INTELLIGENT VISION SYSTEMS, PROCEEDINGS, 2007, 4678 : 285 - 296
  • [7] Patch-based natural object detection using CF*IRF
    Jin, WJ
    Wang, RR
    Wu, LD
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1-3, 2004, : 1559 - 1562
  • [8] Feature Shrinkage Pyramid for Camouflaged Object Detection with Transformers
    Huang, Zhou
    Dai, Hang
    Xiang, Tian-Zhu
    Wang, Shuo
    Chen, Huai-Xin
    Qin, Jie
    Xiong, Huan
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5557 - 5566
  • [9] One-pass Keypoint Selection to Construct Codebook for Patch-based Object Classification
    Vinoharan, Veerapathirapillai
    Ramanan, Amirthalingam
    [J]. 2018 IEEE 9TH INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION FOR SUSTAINABILITY (ICIAFS' 2018), 2018,
  • [10] Weeds Classification with Deep Learning: An Investigation Using CNN, Vision Transformers, Pyramid Vision Transformers, and Ensemble Strategy
    Rozendo, Guilherme Botazzo
    Roberto, Guilherme Freire
    Zanchetta do Nascimento, Marcelo
    Neves, Leandro Alves
    Lumini, Alessandra
    [J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, CIARP 2023, PT I, 2024, 14469 : 229 - 243