Amodal instance segmentation with dual guidance from contextual and shape priors

Cited: 0
Authors
Zhan, Jiao [1 ]
Luo, Yarong [1 ]
Guo, Chi [1 ,2 ]
Wu, Yejun [3 ]
Yang, Bohan [1 ]
Wang, Jingrong [1 ]
Liu, Jingnan [1 ]
Affiliations
[1] Wuhan Univ, GNSS Res Ctr, Wuhan 430072, Hubei, Peoples R China
[2] Hubei Luojia Lab, Wuhan 430079, Hubei, Peoples R China
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China
Funding
China Postdoctoral Science Foundation;
Keywords
Instance segmentation; Amodal instance segmentation; Pixel affinity; Contextual dependency;
DOI
10.1016/j.asoc.2024.112602
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human perception possesses the remarkable ability to mentally reconstruct the complete structure of occluded objects, which has inspired researchers to pursue amodal instance segmentation for a more comprehensive understanding of the scene. Previous works have shown promising results, but they often capture contextual dependencies in an unsupervised way, which can lead to undesirable contextual dependencies and unreasonable feature representations. To tackle this problem, we propose a Pixel Affinity-Parsing (PAP) module trained with the Pixel Affinity Loss (PAL). Embedded into a CNN, the PAP module can leverage learned contextual priors to guide the network to explicitly distinguish different relationships between pixels, thus capturing the intra-class and inter-class contextual dependencies in a non-local and supervised way. This process helps to yield robust feature representations that prevent the network from making misjudgments. To demonstrate the effectiveness of the PAP module, we design an effective Pixel Affinity-Parsing Network (PAPNet). Notably, PAPNet also introduces shape priors to guide the amodal mask refinement process, thus preventing implausible shapes in the predicted masks. Consequently, with the dual guidance of contextual and shape priors, PAPNet can reconstruct the full shape of occluded objects accurately and reasonably. Experimental results demonstrate that the proposed PAPNet outperforms existing state-of-the-art methods on multiple amodal datasets. Specifically, on the KINS dataset, PAPNet achieves 37.1% AP, 60.6% AP50 and 39.8% AP75, surpassing C2F-Seg by 0.6%, 2.4% and 2.8%. On the D2SA dataset, PAPNet achieves 71.70% AP, 85.98% AP50 and 77.10% AP75, surpassing PGExp by 0.75% and 0.33% in AP50 and AP75, and being comparable to PGExp in AP. On the COCOA-cls dataset, PAPNet achieves 41.29% AP, 60.95% AP50 and 46.17% AP75, surpassing PGExp by 3.74%, 3.21% and 4.76%.
On the CWALT dataset, PAPNet achieves 72.51% AP, 85.02% AP50 and 80.47% AP75, surpassing VRSPNet by 5.38%, 0.07% and 5.35%. The code is available at https://github.com/jiaoZ7688/PAP-Net.
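The abstract describes supervising pixel affinities (intra-class pairs attract, inter-class pairs are distinguished) and using the learned affinity map for non-local context aggregation. The paper's exact formulation is not given here, so the following NumPy sketch is only an illustration under assumed definitions: a binary ground-truth affinity (1 for same-class pixel pairs), a cross-entropy affinity loss standing in for PAL, and row-normalized affinity-weighted feature aggregation standing in for the PAP parsing step. Function names (`gt_affinity`, `affinity_loss`, `parse_context`) are hypothetical, not from the paper's code.

```python
import numpy as np

def gt_affinity(labels):
    # Ground-truth pixel affinity: 1 if two pixels share a class label, else 0.
    l = labels.reshape(-1, 1)
    return (l == l.T).astype(float)

def affinity_loss(pred_aff, labels, eps=1e-7):
    # Binary cross-entropy between predicted and ground-truth affinities,
    # explicitly supervising intra-class (1) and inter-class (0) pixel pairs.
    gt = gt_affinity(labels)
    p = np.clip(pred_aff, eps, 1.0 - eps)
    return -np.mean(gt * np.log(p) + (1.0 - gt) * np.log(1.0 - p))

def parse_context(features, pred_aff):
    # Non-local context aggregation guided by the affinity map:
    # each pixel gathers features from all pixels, weighted by its
    # row-normalized affinities, yielding context-enriched representations.
    w = pred_aff / pred_aff.sum(axis=1, keepdims=True)
    return w @ features

# Toy example: 4 pixels with 3-channel features, two classes.
labels = np.array([0, 0, 1, 1])
features = np.random.rand(4, 3)
pred_aff = gt_affinity(labels) * 0.9 + 0.05  # a near-perfect predicted affinity
loss = affinity_loss(pred_aff, labels)       # small, since prediction matches GT
context = parse_context(features, pred_aff)  # same shape as the input features
```

In this toy setup, a confident, correct affinity prediction yields a small loss, and each pixel's aggregated feature is dominated by pixels of its own class, which is the intuition behind using supervised affinities to suppress undesirable cross-class context.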
Pages: 18
Related papers
50 records total
  • [21] Improved Sliding Shapes for Instance Segmentation of Amodal 3D Object
    Lin, Jinhua
    Yao, Yu
    Wang, Yanjie
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2018, 12 (11): : 5555 - 5567
  • [22] Reinforcement learning for instance segmentation with high-level priors
    Hilt, Paul
    Zarvandi, Maedeh
    Kaziakhmedov, Edgar
    Bhide, Sourabh
    Laptin, Maria
    Pape, Constantin
    Kreshuk, Anna
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3915 - 3924
  • [23] Predicting Future Instance Segmentation with Contextual Pyramid ConvLSTMs
    Sun, Jiangxin
    Xie, Jiafeng
    Hu, Jian-fang
    Lin, Zihang
    Lai, Jianhuang
    Zeng, Wenjun
    Zheng, Wei-shi
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2043 - 2051
  • [24] Segmentation of kidney from ultrasound images based on texture and shape priors
    Xie, J
    Jiang, YF
    Tsui, HT
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2005, 24 (01) : 45 - 57
  • [25] Interactive Lesion Segmentation with Shape Priors From Offline and Online Learning
    Shepherd, Tony
    Prince, Simon J. D.
    Alexander, Daniel C.
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2012, 31 (09) : 1698 - 1712
  • [26] Dual Mask Branches for Instance Segmentation
    Zhang, Xiaoliang
    Liu, Yuankun
    Li, Mao
    Zhang, Yuqing
    Chen, Changfeng
    Yin, Jie
    Zhou, Xiantao
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, ICDSP 2024, 2024, : 39 - 46
  • [27] K-convexity Shape Priors for Segmentation
    Isack, Hossam
    Gorelick, Lena
    Ng, Karin
    Veksler, Olga
    Boykov, Yuri
    COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215 : 38 - 54
  • [28] Combining Shape Priors and MRF-Segmentation
    Flach, Boris
    Schlesinger, Dmitrij
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2008, 5342 : 177 - 186
  • [29] Deep Learning Shape Priors for Object Segmentation
    Chen, Fei
    Yu, Huimin
    Hu, Roland
    Zeng, Xunxun
    2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2013, : 1870 - 1877
  • [30] A Bayesian approach for image segmentation with shape priors
    Chang, Hang
    Yang, Qing
    Parvin, Bahram
    2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 687 - +