Towards Partial Supervision for Generic Object Counting in Natural Scenes

Cited by: 3
Authors
Cholakkal, Hisham [1]
Sun, Guolei [2]
Khan, Salman [1]
Khan, Fahad Shahbaz [1]
Shao, Ling [1, 3]
Van Gool, Luc [2]
Affiliations
[1] Mohamed Bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
[2] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland
[3] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
Keywords
Visualization; Genomics; Bioinformatics; Image segmentation; Modulation; Sun; Graphical models; Generic object counting; reduced supervision; object localization; weakly supervised instance segmentation
DOI
10.1109/TPAMI.2020.3021025
Chinese Library Classification number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generic object counting in natural scenes is a challenging computer vision problem. Existing approaches rely on either instance-level supervision or absolute count information to train a generic object counter. We introduce a partially supervised setting that significantly reduces the supervision level required for generic object counting. We propose two novel frameworks, named lower-count (LC) and reduced lower-count (RLC), to enable object counting under this setting. Our frameworks are built on a novel dual-branch architecture that has an image classification branch and a density branch. Our LC framework reduces the annotation cost due to multiple instances in an image by using only lower-count supervision for all object categories. Our RLC framework further reduces the annotation cost arising from large numbers of object categories in a dataset by using lower-count supervision only for a subset of categories and class labels for the remaining ones. The RLC framework extends our dual-branch LC framework with a novel weight modulation layer and a category-independent density map prediction. Experiments are performed on the COCO, Visual Genome and PASCAL 2007 datasets. Our frameworks perform on par with state-of-the-art approaches that use higher levels of supervision. Additionally, we demonstrate the applicability of our LC supervised density map for image-level supervised instance segmentation.
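As a rough illustration of the two annotation settings described in the abstract, the sketch below converts full per-category counts into LC-style and RLC-style labels. It covers only the labels, not the dual-branch network. The function names, the label strings, and the cutoff `max_exact=4` are assumptions made for this example; the abstract does not state the count range used.

```python
from typing import Dict, Set, Union

Label = Union[int, str]


def lower_count_labels(counts: Dict[str, int], max_exact: int = 4) -> Dict[str, Label]:
    """LC-style labels: exact counts up to max_exact, otherwise only 'more than max_exact'.

    max_exact is a hypothetical cutoff chosen for illustration.
    """
    labels: Dict[str, Label] = {}
    for category, n in counts.items():
        if n < 0:
            raise ValueError(f"negative count for {category!r}")
        labels[category] = n if n <= max_exact else f">{max_exact}"
    return labels


def reduced_lower_count_labels(
    counts: Dict[str, int],
    counted_categories: Set[str],
    max_exact: int = 4,
) -> Dict[str, Label]:
    """RLC-style labels: lower-count labels for a subset of categories,
    plain presence/absence class labels for all remaining categories."""
    lc = lower_count_labels(counts, max_exact)
    labels: Dict[str, Label] = {}
    for category, n in counts.items():
        if category in counted_categories:
            labels[category] = lc[category]
        else:
            labels[category] = "present" if n > 0 else "absent"
    return labels


if __name__ == "__main__":
    image_counts = {"person": 2, "sheep": 9, "car": 0}
    print(lower_count_labels(image_counts))
    print(reduced_lower_count_labels(image_counts, {"person"}))
```

Under these assumptions, an annotator never has to count past the cutoff in the LC setting, and in the RLC setting only the categories in `counted_categories` need any counting at all.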
Pages: 1604-1622
Page count: 19
Related papers
50 in total
  • [31] Fast image-based object localization in natural scenes
    Hanek, R
    Schmitt, T
    Buck, S
    Beetz, M
    2002 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-3, PROCEEDINGS, 2002, : 116 - 122
  • [32] Object Segmentation Controls Image Reconstruction From Natural Scenes
    Neri, Peter
    PERCEPTION, 2016, 45 (06) : 687 - 687
  • [33] A Hierarchical Probabilistic Model for Rapid Object Categorization in Natural Scenes
    He, Xiaofu
    Yang, Zhiyong
    Tsien, Joe Z.
    PLOS ONE, 2011, 6 (05):
  • [34] Color contributes to object-contour perception in natural scenes
    Hansen, Thorsten
    Gegenfurtner, Karl R.
    JOURNAL OF VISION, 2017, 17 (03):
  • [35] Human-Object Interaction Prediction with Natural Language Supervision
    Li, Zhengxue
    An, Gaoyun
    2022 16TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP2022), VOL 1, 2022, : 124 - 128
  • [36] Towards Categorization and Pose Estimation of Sets of Occluded Objects in Cluttered Scenes from Depth Data and Generic Object Models Using Joint Parsing
    Basevi, Hector
    Leonardis, Ales
    COMPUTER VISION - ECCV 2016 WORKSHOPS, PT III, 2016, 9915 : 665 - 681
  • [37] Towards a generic architecture for mobile object-oriented applications
    Haahr, M
    Cunningham, R
    Cahill, V
    2000 IEEE SERVICE PORTABILITY AND VIRTUAL CUSTOMER ENVIRONMENTS, 2001, : 91 - 96
  • [38] Towards Perspective-Free Object Counting with Deep Learning
    Onoro-Rubio, Daniel
    Lopez-Sastre, Roberto J.
    COMPUTER VISION - ECCV 2016, PT VII, 2016, 9911 : 615 - 629
  • [39] The properties of object representations constructed during visual search in natural scenes
    Inoue, Kazuya
    Takeda, Yuji
    VISUAL COGNITION, 2014, 22 (9-10) : 1135 - 1153
  • [40] Supervision by Fusion: Towards Unsupervised Learning of Deep Salient Object Detector
    Zhang, Dingwen
    Han, Junwei
    Zhang, Yu
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4068 - 4076