Consensus Feature Network for Scene Parsing

Cited by: 1
Authors
Wu, Tianyi [1, 2]
Tang, Sheng [3, 4]
Zhang, Rui [3, 4]
Guo, Guodong [1, 2]
Affiliations
[1] Inst Deep Learning, Baidu Res, Beijing 100085, Peoples R China
[2] Natl Engn Lab Deep Learning Technol & Applicat, Beijing 100085, Peoples R China
[3] Chinese Acad Sci, Institute Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transforms; Semantics; Convolution; Feature extraction; Training; Network architecture; Information and communication technology; Scene Parsing; Instance Consensus Transform; Category Consensus Transform; SEGMENTATION; IMAGES;
DOI
10.1109/TMM.2021.3094333
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Scene parsing is challenging as it aims to assign one of the semantic categories to each pixel in scene images. Thus, pixel-level features are desired for scene parsing. However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category. To address this problem, we propose two transform units to learn pixel-level consensus features. One is an Instance Consensus Transform (ICT) unit, which learns instance-level consensus features by aggregating features within the same instance. The other is a Category Consensus Transform (CCT) unit, which pursues category-level consensus features by keeping the features consistent among instances of the same category in scene images. The proposed ICT and CCT units are lightweight, data-driven, and end-to-end trainable. The features learned by the two units are more coherent at both the instance level and the category level. Furthermore, we present the Consensus Feature Network (CFNet) based on the proposed ICT and CCT units, and demonstrate the effectiveness of each component in our method by performing extensive ablation experiments. Finally, our proposed CFNet achieves competitive performance on four datasets: Cityscapes, Pascal Context, CamVid, and COCO Stuff.
Pages: 3208 - 3217
Number of pages: 10
Related papers
50 in total
  • [21] Scene Parsing Using Fully Convolutional Network for Semantic Segmentation
    Ali, Nisar
    Ijaz, Ali Zeeshan
    Ali, Raja Hashim
    Ul Abideen, Zain
    Bais, Abdul
    2023 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CCECE, 2023,
  • [22] Global-Guided Selective Context Network for Scene Parsing
    Jiang, Jie
    Liu, Jing
    Fu, Jun
    Zhu, Xinxin
    Li, Zechao
    Lu, Hanqing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) : 1752 - 1764
  • [23] Research on scene parsing algorithm cascading object detection network
    Guo, Xi
    Wen, Yuanzhen
    Ma, Dongyuan
    Jin, Yuhui
    Yu, Haitao
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE) AND IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC), VOL 1, 2017, : 459 - 464
  • [24] Global enhancement network underwater archaeology scene parsing method
    Pan, Junyan
    Jia, Jishen
    Cai, Lei
    ROBOTICA, 2023, 41 (12) : 3541 - 3564
  • [25] Re:PolyWorld - A Graph Neural Network for Polygonal Scene Parsing
    Zorzi, Stefano
    Fraundorfer, Friedrich
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16716 - 16725
  • [26] High resolution scene parsing network based on semantic segmentation
    Shi Jian-Feng
    Xang Ning
    Wang A-Chuan
    CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2022, 37 (12) : 1598 - 1606
  • [27] Horizon detection in maritime images using scene parsing network
    Jeong, C. Y.
    Yang, H. S.
    Moon, K. D.
    ELECTRONICS LETTERS, 2018, 54 (12) : 760 - 761
  • [28] Semantic combined network for zero-shot scene parsing
    Wang, Yinduo
    Zhang, Haofeng
    Wang, Shidong
    Long, Yang
    Yang, Longzhi
    IET IMAGE PROCESSING, 2020, 14 (04) : 757 - 765
  • [29] Dense feature pyramid network for cartoon dog parsing
    Wan, Jerome
    Mougeot, Guillaume
    Yang, Xubo
    VISUAL COMPUTER, 2020, 36 (10-12) : 2471 - 2483
  • [30] Dense feature pyramid network for cartoon dog parsing
    Jerome Wan
    Guillaume Mougeot
    Xubo Yang
    The Visual Computer, 2020, 36 : 2471 - 2483