Distilling object detectors with mask-guided feature and relation-based knowledge

Cited by: 0
Authors
Zeng, Liang [1]
Ma, Liyan [1]
Luo, Xiangfeng [1]
Guo, Yinsai [1]
Chen, Xue [1, 2]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
[2] State Key Lab Math Engn & Adv Comp, Wuxi 214083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
knowledge distillation; multi-value mask; object detection;
DOI
10.1504/IJCSE.2024.137291
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203; 0835;
Abstract
Knowledge distillation (KD) is an effective technique for network compression and model accuracy enhancement in image classification, semantic segmentation, pre-trained language models, and other areas. However, existing KD methods are specialised for image classification and cannot be used effectively for object detection, owing to two limitations: the imbalance between foreground and background instances, and the neglect of relation-based knowledge during distillation. In this paper, we present a general mask-guided feature and relation-based knowledge distillation framework (MAR) that addresses these problems with two components: mask-guided distillation and relation-based distillation. Mask-guided distillation emphasises the student's learning of close-to-object features via multi-value masks, while relation-based distillation mimics the relational information between different feature pixels on the classification head. Extensive experiments show that our method achieves excellent AP improvements on both one-stage and two-stage detectors. Specifically, Faster R-CNN with a ResNet50 backbone achieves 40.6% mAP under the 1× schedule on the COCO dataset, which is 3.2% higher than the baseline and even surpasses the teacher detector.
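The abstract names two loss components but gives no formulas, so the following is only a toy sketch of the general idea, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the three mask values (1.0 inside a box, 0.5 in a margin around it, 0.1 elsewhere), the use of a weighted squared error for the feature term, and cosine similarity as the pixel-to-pixel relation. Feature maps are plain nested lists indexed as feat[y][x][channel].

```python
import math

def multi_value_mask(height, width, boxes, inner=1.0, near=0.5, far=0.1, margin=1):
    """Hypothetical multi-value mask: `inner` inside a box, `near` within
    `margin` pixels of a box, `far` everywhere else (values are assumptions)."""
    mask = [[far] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for (x0, y0, x1, y1) in boxes:  # inclusive pixel coordinates
                if x0 <= x <= x1 and y0 <= y <= y1:
                    mask[y][x] = max(mask[y][x], inner)
                elif (x0 - margin <= x <= x1 + margin
                      and y0 - margin <= y <= y1 + margin):
                    mask[y][x] = max(mask[y][x], near)
    return mask

def masked_feature_loss(student, teacher, mask):
    """Mask-weighted squared error between student and teacher features,
    normalised by the total mask weight."""
    total, weight_sum = 0.0, 0.0
    for s_row, t_row, m_row in zip(student, teacher, mask):
        for s_px, t_px, w in zip(s_row, t_row, m_row):
            total += w * sum((s - t) ** 2 for s, t in zip(s_px, t_px))
            weight_sum += w
    return total / weight_sum

def relation_matrix(features):
    """Pairwise cosine similarity between all feature pixels (assumed relation)."""
    pixels = [px for row in features for px in row]

    def cosine(a, b):
        dot = sum(p * q for p, q in zip(a, b))
        norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
        return dot / norm if norm > 0 else 0.0

    return [[cosine(a, b) for b in pixels] for a in pixels]

def relation_loss(student, teacher):
    """Mean squared difference between student and teacher relation matrices."""
    rs, rt = relation_matrix(student), relation_matrix(teacher)
    n = len(rs)
    return sum((rs[i][j] - rt[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)
```

In this sketch the mask concentrates the feature loss on pixels in or near ground-truth boxes, which is one way to counter a foreground/background imbalance, while the relation term compares how pixels relate to each other rather than their raw values. How the two terms would be weighted and combined with the detection loss is not stated in the abstract and is left out here.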
Pages: 195-203
Number of pages: 10