Consensus Feature Network for Scene Parsing

Cited by: 1
Authors
Wu, Tianyi [1 ,2 ]
Tang, Sheng [3 ,4 ]
Zhang, Rui [3 ,4 ]
Guo, Guodong [1 ,2 ]
Affiliations
[1] Inst Deep Learning, Baidu Res, Beijing 100085, Peoples R China
[2] Natl Engn Lab Deep Learning Technol & Applicat, Beijing 100085, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transforms; Semantics; Convolution; Feature extraction; Training; Network architecture; Information and communication technology; Scene Parsing; Instance Consensus Transform; Category Consensus Transform; SEGMENTATION; IMAGES;
DOI
10.1109/TMM.2021.3094333
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Scene parsing is challenging because it aims to assign a semantic category to each pixel in a scene image. Thus, pixel-level features are desired for scene parsing. However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing results in inconsistent parsing predictions within one instance and among instances of the same category. To address this problem, we propose two transform units to learn pixel-level consensus features. One is an Instance Consensus Transform (ICT) unit, which learns instance-level consensus features by aggregating features within the same instance. The other is a Category Consensus Transform (CCT) unit, which pursues category-level consensus features by keeping the features consistent among instances of the same category in scene images. The proposed ICT and CCT units are lightweight, data-driven and end-to-end trainable. The features learned by the two units are more coherent at both the instance level and the category level. Furthermore, we present the Consensus Feature Network (CFNet) based on the proposed ICT and CCT units, and demonstrate the effectiveness of each component in our method by performing extensive ablation experiments. Finally, our proposed CFNet achieves competitive performance on four datasets: Cityscapes, Pascal Context, CamVid, and COCO Stuff.
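The idea of a consensus feature can be illustrated with a small sketch. It is not the ICT or CCT unit from the paper, which are learned, data-driven modules whose internals the abstract does not describe. It assumes instead that a group map (instance ids or category ids) is already given, and simply pulls each pixel feature toward the mean feature of its group. The function name `consensus_features` and the `blend` parameter are hypothetical and introduced only for illustration.

```python
import numpy as np

def consensus_features(features, group_ids, blend=0.5):
    """Blend each pixel feature with the mean feature of its group.

    features:  (H, W, C) array of pixel-level features.
    group_ids: (H, W) int array; instance ids give an instance-level
               consensus, category ids give a category-level one.
    blend:     hypothetical mixing weight in [0, 1]; not from the paper.
    """
    h, w, c = features.shape
    flat = features.reshape(-1, c).astype(np.float64)
    ids = group_ids.reshape(-1)
    # Map arbitrary ids to 0..K-1 so they can index the sums array.
    _, inverse = np.unique(ids, return_inverse=True)
    inverse = inverse.reshape(-1)
    k = int(inverse.max()) + 1
    sums = np.zeros((k, c), dtype=np.float64)
    np.add.at(sums, inverse, flat)              # per-group feature sums
    counts = np.bincount(inverse, minlength=k)[:, None]
    means = sums / counts                       # per-group mean feature
    out = (1.0 - blend) * flat + blend * means[inverse]
    return out.reshape(h, w, c)

# Toy 2x2 feature map with one channel and two groups (ids 7 and 9).
feats = np.array([[[1.0], [3.0]], [[10.0], [20.0]]])
ids = np.array([[7, 7], [9, 9]])
smoothed = consensus_features(feats, ids, blend=1.0)
```

With `blend=1.0` every pixel takes its group mean (2.0 for group 7, 15.0 for group 9), so features inside a group become identical; with `blend=0.0` the input is returned unchanged.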
Pages: 3208-3217
Page count: 10