Consensus Feature Network for Scene Parsing

Authors
Wu, Tianyi [1, 2]
Tang, Sheng [3, 4]
Zhang, Rui [3, 4]
Guo, Guodong [1, 2]
Affiliations
[1] Inst Deep Learning, Baidu Res, Beijing 100085, Peoples R China
[2] Natl Engn Lab Deep Learning Technol & Applicat, Beijing 100085, Peoples R China
[3] Chinese Acad Sci, Institute Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Transforms; Semantics; Convolution; Feature extraction; Training; Network architecture; Information and communication technology; Scene Parsing; Instance Consensus Transform; Category Consensus Transform; Segmentation; Images
DOI
10.1109/TMM.2021.3094333
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Scene parsing is challenging because it aims to assign one of the semantic categories to each pixel in a scene image, so pixel-level features are required. However, classification networks are dominated by the discriminative portion, so directly applying them to scene parsing results in inconsistent predictions within one instance and among instances of the same category. To address this problem, we propose two transform units that learn pixel-level consensus features. One is an Instance Consensus Transform (ICT) unit, which learns instance-level consensus features by aggregating features within the same instance. The other is a Category Consensus Transform (CCT) unit, which pursues category-level consensus features by maintaining the consensus of features among instances of the same category in scene images. The proposed ICT and CCT units are lightweight, data-driven, and end-to-end trainable. The features learned by the two units are more coherent at both the instance level and the category level. Furthermore, we present the Consensus Feature Network (CFNet), built on the proposed ICT and CCT units, and demonstrate the effectiveness of each component through extensive ablation experiments. Finally, CFNet achieves competitive performance on four datasets: Cityscapes, Pascal Context, CamVid, and COCO Stuff.
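As a rough illustration of what "aggregating features within the same instance" could mean, the sketch below blends each pixel's feature vector with the mean feature of its instance. This is not the ICT unit from the paper, whose design is not described in this record: the function name `instance_consensus`, the blending weight `alpha`, and the use of given instance labels are all assumptions made for illustration, whereas the abstract describes a learned, end-to-end trainable unit.

```python
from collections import defaultdict


def instance_consensus(features, instance_ids, alpha=0.5):
    """Blend each pixel's feature with the mean feature of its instance.

    features: list of per-pixel feature vectors (lists of floats).
    instance_ids: list of hashable instance labels, one per pixel
        (assumed to be given here; a real network would have to learn them).
    alpha: hypothetical blending weight; 0 keeps the input unchanged,
        1 replaces every feature with its instance mean.
    """
    if len(features) != len(instance_ids):
        raise ValueError("features and instance_ids must have equal length")

    # Accumulate per-instance feature sums and pixel counts.
    sums = {}
    counts = defaultdict(int)
    for feat, inst in zip(features, instance_ids):
        if inst not in sums:
            sums[inst] = [0.0] * len(feat)
        for k, value in enumerate(feat):
            sums[inst][k] += value
        counts[inst] += 1

    # Per-instance mean feature.
    means = {
        inst: [s / counts[inst] for s in total] for inst, total in sums.items()
    }

    # Pull every pixel's feature toward the mean of its own instance.
    return [
        [(1 - alpha) * v + alpha * m for v, m in zip(feat, means[inst])]
        for feat, inst in zip(features, instance_ids)
    ]
```

With `alpha=1.0`, all pixels of one instance end up with identical features, which is the extreme case of the within-instance consistency the abstract is after; a category-level analogue would group pixels by category label instead of instance label.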
Pages: 3208-3217
Page count: 10