Cross-View Semantic Segmentation for Sensing Surroundings

Cited by: 153
Authors
Pan, Bowen [1]
Sun, Jiankai [2]
Leung, Ho Yin Tiga [2]
Andonian, Alex [1]
Zhou, Bolei [2]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Peoples R China
Keywords
Semantic scene understanding; deep learning for visual perception; visual learning; visual-based navigation; computer vision for other robotic applications
DOI
10.1109/LRA.2020.3004325
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from observations. To equip robot perception with such a surrounding-sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation, as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse first-view observations into a top-down-view semantic map indicating the spatial location of all objects at the pixel level. The main issue of this task is the lack of real-world annotations for top-down-view data. To mitigate this, we train the VPN in a 3D graphics environment and use domain adaptation to transfer it to real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively make use of information from different views and multiple modalities to understand spatial information. Our further experiment on a LoCoBot robot shows that our model enables the surrounding-sensing capability from 2D image input. Code and demo videos can be found at https://view-parsing-network.github.io.
Pages: 4867-4873
Page count: 7