Towards Interpretable Object Detection by Unfolding Latent Structures

Cited by: 14
Authors
Wu, Tianfu [1, 2]
Song, Xi
Affiliations
[1] NC State Univ, Dept ECE, Raleigh, NC 27695 USA
[2] NC State Univ, Visual Narrat Initiat, Raleigh, NC 27695 USA
Keywords
DOI
10.1109/ICCV.2019.00613
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper first proposes a method of formulating model interpretability in visual understanding tasks based on the idea of unfolding latent structures. It then presents a case study in object detection using popular two-stage region-based convolutional network (i.e., R-CNN) detection systems [19, 50, 7, 23]. The proposed method focuses on weakly-supervised extractive rationale generation, that is, learning to unfold latent discriminative part configurations of object instances automatically and simultaneously in detection without using any supervision for part configurations. It utilizes a top-down hierarchical and compositional grammar model embedded in a directed acyclic AND-OR Graph (AOG) to explore and unfold the space of latent part configurations of regions of interest (RoIs). It presents an AOGParsing operator that seamlessly integrates with the RoIPooling [19]/RoIAlign [23] operator widely used in R-CNN and is trained end-to-end. In object detection, a bounding box is interpreted by the best parse tree derived from the AOG on-the-fly, which is treated as the qualitatively extractive rationale generated for interpreting detection. In experiments, Faster R-CNN [50] is used to test the proposed method on the PASCAL VOC 2007 [13] and the COCO 2017 [40] object detection datasets. The experimental results show that the proposed method can compute promising latent structures without hurting the performance. The code and pretrained models are available at https://github.com/iVMCL/iRCNN.
Pages: 6032-6042
Page count: 11
Related papers
50 in total
  • [1] Towards Real Time Interpretable Object Detection for UAV Platform by Saliency Maps
    Hogan, Maxwell
    Aouf, Nabil
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE-ROBIO 2021), 2021, : 1178 - 1183
  • [2] TOWARDS HUMAN-LIKE INTERPRETABLE OBJECT DETECTION VIA SPATIAL RELATION ENCODING
    Kim, Jung Uk
    Park, Sungjune
    Ro, Yong Man
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 3284 - 3288
  • [3] Interpretable Networks for Hyperspectral Anomaly Detection: A Deep Unfolding Solution
    Li, Chenyu
    Zhang, Bing
    Hong, Danfeng
    Yao, Jing
    Jia, Xiuping
    Plaza, Antonio
    Chanussot, Jocelyn
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [4] Towards Interpretable Video Anomaly Detection
    Doshi, Keval
    Yilmaz, Yasin
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2654 - 2663
  • [5] Towards Interpretable Natural Language Understanding with Explanations as Latent Variables
    Zhou, Wangchunshu
    Hu, Jinyi
    Zhang, Hanlin
    Liang, Xiaodan
    Sun, Maosong
    Xiong, Chenyan
    Tang, Jian
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [6] BLasso for object categorization and retrieval: Towards interpretable visual models
    Rebai, Ahmed
    Joly, Alexis
    Boujemaa, Nozha
    PATTERN RECOGNITION, 2012, 45 (06) : 2377 - 2389
  • [7] Latent Hough Transform for Object Detection
    Razavi, Nima
    Gall, Juergen
    Kohli, Pushmeet
    van Gool, Luc
    COMPUTER VISION - ECCV 2012, PT III, 2012, 7574 : 312 - 325
  • [8] LRR-Net: An Interpretable Deep Unfolding Network for Hyperspectral Anomaly Detection
    Li, Chenyu
    Zhang, Bing
    Hong, Danfeng
    Yao, Jing
    Chanussot, Jocelyn
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [9] Towards Dependable Object Detection
    Selvaraj, Nithish Muthuchamy
    Muhammad, Ilyas
    Cheah, Chien Chern
    IECON 2020: THE 46TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2020, : 523 - 528
  • [10] The object detection logic of latent variable technologies
    Maraun, Michael
    Quality & Quantity, 2017, 51 : 239 - 259