CF-DETR: Coarse-to-Fine Transformers for End-to-End Object Detection

被引:0
|
作者
Cao, Xipeng [1 ,2 ]
Yuan, Peng [2 ]
Feng, Bailan [2 ]
Niu, Kun [1 ]
机构
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] Huawei Noahs Ark Lab, Montreal, PQ, Canada
基金
中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The recently proposed DEtection TRansformer (DETR) achieves promising performance for end-to-end object detection. However, it has relatively lower detection performance on small objects and suffers from slow convergence. This paper observed that DETR performs surprisingly well even on small objects when measuring Average Precision (AP) at decreased Intersection-over-Union (IoU) thresholds. Motivated by this observation, we propose a simple way to improve DETR by refining the coarse features and predicted locations. Specifically, we propose a novel Coarse-to-Fine (CF) decoder layer constituted of a coarse layer and a carefully designed fine layer. Within each CF decoder layer, the extracted local information (region of interest feature) is introduced into the flow of global context information from the coarse layer to refine and enrich the object query features via the fine layer. In the fine layer, the multi-scale information can be fully explored and exploited via the Adaptive Scale Fusion(ASF) module and Local Cross-Attention (LCA) module. The multi-scale information can also be enhanced by another proposed Transformer Enhanced FPN (TER) module to further improve the performance. With our proposed framework (named CF-DETR), the localization accuracy of objects (especially for small objects) can be largely improved. As a byproduct, the slow convergence issue of DETR can also be addressed. The effectiveness of CF-DETR is validated via extensive experiments on the coco benchmark. CF-DETR achieves state-of-the-art performance among end-to-end detectors, e.g., achieving 47.8 AP using ResNet-50 with 36 epochs in the standard 3x training schedule.
引用
收藏
页码:185 / 193
页数:9
相关论文
共 50 条
  • [41] End-to-End Object Detection with Fully Convolutional Network
    Wang, Jianfeng
    Song, Lin
    Li, Zeming
    Sun, Hongbin
    Sun, Jian
    Zheng, Nanning
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 15844 - 15853
  • [42] SRDD: a lightweight end-to-end object detection with transformer
    Zhu, Yuan
    Xia, Qingyuan
    Jin, Wen
    CONNECTION SCIENCE, 2022, 34 (01) : 2448 - 2465
  • [43] Progressive End-to-End Object Detection in Crowded Scenes
    Zheng, Anlin
    Zhang, Yuang
    Zhang, Xiangyu
    Qi, Xiaojuan
    Sun, Jian
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 847 - 856
  • [44] Toward End-to-End Object Detection and Tracking on the Edge
    Tabkhi, Hamed
    SEC 2017: 2017 THE SECOND ACM/IEEE SYMPOSIUM ON EDGE COMPUTING (SEC'17), 2017,
  • [45] Dense Distinct Query for End-to-End Object Detection
    Zhang, Shilong
    Wang, Xinjiang
    Wang, Jiaqi
    Pang, Jiangmiao
    Lyu, Chengqi
    Zhang, Wenwei
    Luo, Ping
    Chen, Kai
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 7329 - 7338
  • [46] End-to-End Edge Neuromorphic Object Detection System
    Silva, D. A.
    Shymyrbay, A.
    Smagulova, K.
    Elsheikh, A.
    Fouda, M. E.
    Eltawil, A. M.
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 194 - 198
  • [47] DCEA: DETR With Concentrated Deformable Attention for End-to-End Ship Detection in SAR Images
    Lin, Hai
    Liu, Jin
    Li, Xingye
    Wei, Lai
    Liu, Yuxin
    Han, Bing
    Wu, Zhongdai
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 17292 - 17307
  • [48] Sparse Block DETR: Precise and Speedy End-to-End Detector for PCB Defect Detection
    Hong, JiXuan
    Xie, JingJing
    Yang, ChenHui
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT I, 2023, 14254 : 281 - 292
  • [49] End-to-end pest detection on an improved deformable DETR with multihead criss cross attention
    Qi, Fang
    Chen, Gangming
    Liu, Jieyuan
    Tang, Zhe
    ECOLOGICAL INFORMATICS, 2022, 72
  • [50] End-to-end Symbolic Regression with Transformers
    Kamienny, Pierre-Alexandre
    d'Ascoli, Stephane
    Lample, Guillaume
    Charton, Francois
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,