Cross-View Semantic Segmentation for Sensing Surroundings

Cited by: 118
Authors
Pan, Bowen [1]
Sun, Jiankai [2]
Leung, Ho Yin Tiga [2]
Andonian, Alex [1]
Zhou, Bolei [2]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Peoples R China
Source
Keywords
Semantic scene understanding; deep learning for visual perception; visual learning; visual-based navigation; computer vision for other robotic applications
DOI
10.1109/LRA.2020.3004325
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from observations. To equip robots with a similar surrounding-sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation, together with a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse first-view observations into a top-down-view semantic map that indicates the spatial location of all objects at the pixel level. The main difficulty of this task is the lack of real-world annotations for top-down-view data. To mitigate this, we train the VPN in a 3D graphics environment and use domain adaptation to transfer it to real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively use information from different views and multiple modalities to understand spatial information. A further experiment on a LoCoBot robot shows that our model enables surrounding sensing from 2D image input. Code and demo videos can be found at https://view-parsing-network.github.io.
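To make the target output of the task concrete, the following is a minimal Python sketch of what a top-down-view semantic map can look like as a data structure. It is not the VPN method, which learns the mapping from first-view images; here the labeled ground-plane points are simply assumed to be given. The function name `top_down_semantic_map`, the grid size, the cell size, and the coordinate convention are all hypothetical choices made for illustration.

```python
def top_down_semantic_map(points, grid_size=8, cell_m=0.5, free_label=0):
    """Rasterize labeled ground-plane points into a top-down semantic grid.

    points: iterable of (x, z, label), with x lateral and z forward in
            metres relative to the agent (an assumed convention).
    The agent sits at the bottom-centre of the grid; row 0 is farthest ahead.
    Cells that receive no point keep free_label (free space).
    """
    grid = [[free_label] * grid_size for _ in range(grid_size)]
    half = grid_size // 2
    for x, z, label in points:
        col = int(x // cell_m) + half            # lateral offset -> column
        row = grid_size - 1 - int(z // cell_m)   # forward distance -> row
        if 0 <= row < grid_size and 0 <= col < grid_size:
            grid[row][col] = label               # points outside the grid are dropped
    return grid


if __name__ == "__main__":
    # Hypothetical labels: 2 = chair, 3 = table; the third point is out of range.
    demo = top_down_semantic_map([(-1.0, 1.0, 2), (0.6, 3.2, 3), (0.0, 10.0, 4)])
    for row in demo:
        print(" ".join(str(v) for v in row))
```

Each cell of the resulting grid holds one semantic class for a fixed patch of floor area, which is the kind of pixel-level spatial layout the abstract refers to.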
Pages: 4867-4873
Page count: 7