IntelPVT: intelligent patch-based pyramid vision transformers for object detection and classification

Cited by: 1
Authors
Nimma, Divya [1]
Zhou, Zhaoxian [1]
Affiliations
[1] Univ Southern Mississippi, Sch Comp Sci & Comp Engn, 118 Coll Dr, Hattiesburg, MS 39406 USA
Keywords
Vision transformer; Object detection; Object classification; Pyramid vision transformer; Adaptive patch; Intelligent method;
DOI
10.1007/s13042-023-01996-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Since the advent of Transformers, followed by Vision Transformers (ViTs), researchers have achieved enormous success in computer vision and object detection. However, the mechanism of splitting images into fixed-size patches poses a serious challenge in this area and results in the loss of useful information during object detection and classification. To overcome these challenges, we propose an innovative intelligent patching mechanism and integrate it seamlessly into the conventional patch-based ViT framework. The proposed method enables the use of patches with flexible sizes to capture and retain essential semantic content from input images, and therefore improves performance compared with conventional methods. Our method was evaluated on three well-known datasets for object detection and classification: Microsoft Common Objects in Context (MSCOCO-2017), Pascal VOC (Visual Object Classes Challenge), and Cityscapes. The experimental results showed promising improvements in specific metrics, particularly at higher confidence thresholds, making it a notable performer in object detection and classification tasks.
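To illustrate the contrast the abstract draws between fixed and flexible patch sizes, here is a minimal sketch in Python with NumPy. `fixed_patches` shows the standard uniform grid used by conventional ViTs. `flexible_patches` is a hypothetical stand-in, not the IntelPVT mechanism, which the abstract does not specify: it assumes a quadtree rule that subdivides a region while its pixel variance exceeds a threshold. All function names, parameters, and the variance criterion are assumptions made for illustration, and the sketch assumes a square grayscale image whose side is a power of two.

```python
import numpy as np


def fixed_patches(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch tiles."""
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image sides must be multiples of the patch size")
    # (H/p, p, W/p, p) -> (H/p, W/p, p, p) -> (N, p, p)
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch, patch)


def flexible_patches(image, min_size, var_threshold):
    """Hypothetical quadtree split (an assumption, not the paper's method).

    Returns a list of (row, col, size) squares. A square is kept whole when
    its pixel variance is at or below var_threshold, or when it has reached
    min_size; otherwise it is split into four equal quadrants.
    """
    patches = []

    def split(row, col, size):
        region = image[row:row + size, col:col + size]
        if size <= min_size or region.var() <= var_threshold:
            patches.append((row, col, size))
            return
        half = size // 2
        for d_row in (0, half):
            for d_col in (0, half):
                split(row + d_row, col + d_col, half)

    split(0, 0, image.shape[0])
    return patches


# A flat 8x8 image with a single bright pixel in the top-left corner.
img = np.zeros((8, 8))
img[0, 0] = 1.0

uniform = fixed_patches(img, 4)                                # uniform 4x4 grid
adaptive = flexible_patches(img, min_size=2, var_threshold=0.0)
```

In this toy input the uniform grid spends the same patch size on flat and detailed regions, while the quadtree sketch keeps large patches over the flat regions and uses smaller ones only around the bright pixel.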
Pages: 1767-1778
Page count: 12