Dense Distinct Query for End-to-End Object Detection

Cited by: 106
Authors
Zhang, Shilong [1, 3]
Wang, Xinjiang [2 ]
Wang, Jiaqi [1 ]
Pang, Jiangmiao [1 ]
Lyu, Chengqi [1 ]
Zhang, Wenwei [1, 4]
Luo, Ping [1, 3]
Chen, Kai [1 ]
Affiliations
[1] Shanghai AI Laboratory, Shanghai, People's Republic of China
[2] SenseTime Research, Hong Kong, People's Republic of China
[3] University of Hong Kong, Hong Kong, People's Republic of China
[4] Nanyang Technological University, S-Lab, Singapore, Singapore
Funding
National Key Research and Development Program of China
DOI
10.1109/CVPR52729.2023.00708
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One-to-one label assignment in object detection has obviated the need for non-maximum suppression (NMS) as postprocessing and makes the pipeline end-to-end. However, it creates a new dilemma: the widely used sparse queries cannot guarantee high recall, while dense queries inevitably introduce more similar queries and run into optimization difficulties. Since both sparse and dense queries are problematic, what queries should end-to-end object detection use? This paper shows that the solution should be Dense Distinct Queries (DDQ). Concretely, we first lay dense queries as in traditional detectors and then select distinct ones for one-to-one assignment. DDQ blends the advantages of traditional and recent end-to-end detectors and significantly improves the performance of various detectors, including FCN, R-CNN, and DETRs. Most impressively, DDQ-DETR achieves 52.1 AP on the MS-COCO dataset within 12 epochs using a ResNet-50 backbone, outperforming all existing detectors in the same setting. DDQ also shares the benefit of end-to-end detectors in crowded scenes and achieves 93.8 AP on CrowdHuman. We hope DDQ can inspire researchers to consider the complementarity between traditional methods and end-to-end detectors. The source code can be found at https://github.com/jshilong/DDQ.
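The "lay dense queries, then select distinct ones" step can be illustrated with a minimal sketch. The abstract does not say how distinctness is measured, so this example assumes a greedy, class-agnostic IoU filter over the boxes predicted by the dense queries. The function names `iou` and `select_distinct`, the threshold `iou_thr`, and the cap `max_keep` are hypothetical and not taken from the DDQ code base.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_distinct(boxes: Sequence[Box], scores: Sequence[float],
                    iou_thr: float = 0.7, max_keep: int = 300) -> List[int]:
    """Return indices of distinct queries (assumed rule: greedy IoU filtering).

    Queries are visited from highest to lowest score; a query is kept only
    if its box overlaps every already-kept box by at most `iou_thr`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in kept):
            kept.append(i)
            if len(kept) == max_keep:
                break
    return kept


# Three dense queries: the first two predict nearly the same box.
dense_boxes = [(0, 0, 10, 10), (0, 0, 10, 9.5), (20, 20, 30, 30)]
dense_scores = [0.9, 0.8, 0.7]
distinct = select_distinct(dense_boxes, dense_scores)
```

In this sketch only the indices in `distinct` would go on to one-to-one label assignment; the near-duplicate second query is filtered out before it can compete with the first for the same ground-truth object.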
Pages: 7329-7338
Page count: 10