Cross-View Semantic Segmentation for Sensing Surroundings

Cited by: 153
Authors
Pan, Bowen [1]
Sun, Jiankai [2]
Leung, Ho Yin Tiga [2]
Andonian, Alex [1]
Zhou, Bolei [2]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Peoples R China
Keywords
Semantic scene understanding; deep learning for visual perception; visual learning; visual-based navigation; computer vision for other robotic applications
DOI
10.1109/LRA.2020.3004325
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To facilitate the robot perception with such a surrounding sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse the first-view observations into a top-down-view semantic map indicating the spatial location of all the objects at the pixel level. The main issue of this task is that we lack the real-world annotations of top-down-view data. To mitigate this, we train the VPN in a 3D graphics environment and utilize the domain adaptation technique to transfer it to handle real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively make use of the information from different views and multi-modalities to understand spatial information. Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input. Code and demo videos can be found at https://view-parsing-network.github.io.
Pages: 4867 - 4873
Page count: 7
Related papers
50 in total
  • [31] From Satellite to Ground: Satellite Assisted Visual Localization with Cross-view Semantic Matching
    Zhang, Guofeng (zhangguofeng@zju.edu.cn), 1600, Institute of Electrical and Electronics Engineers Inc.
  • [32] From Satellite to Ground: Satellite Assisted Visual Localization with Cross-view Semantic Matching
    Guo, Xiyue
    Peng, Haocheng
    Hu, Junjie
    Bao, Hujun
    Zhang, Guofeng
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 3977 - 3983
  • [33] Convolutional Cross-View Pose Estimation
    Xia, Zimin
    Booij, Olaf
    Kooij, Julian F. P.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3813 - 3831
  • [34] Multiview Co-segmentation for Wide Baseline Images using Cross-view Supervision
    Yao, Yuan
    Park, Hyun Soo
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1931 - 1940
  • [35] Cross-View Label Transfer in Knee MR Segmentation Using Iterative Context Learning
    Li, Tong
    Xuan, Kai
    Xue, Zhong
    Chen, Lei
    Zhang, Lichi
    Qian, Dahong
    DOMAIN ADAPTATION AND REPRESENTATION TRANSFER, AND DISTRIBUTED AND COLLABORATIVE LEARNING, DART 2020, DCL 2020, 2020, 12444 : 96 - 105
  • [36] Cross-View Panorama Image Synthesis
    Wu, Songsong
    Tang, Hao
    Jing, Xiao-Yuan
    Zhao, Haifeng
    Qian, Jianjun
    Sebe, Nicu
    Yan, Yan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 3546 - 3559
  • [37] Cross-View Fusion for Multi-View Clustering
    Huang, Zhijie
    Huang, Binqiang
    Zheng, Qinghai
    Yu, Yuanlong
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 621 - 625
  • [38] X-Align++: cross-modal cross-view alignment for Bird's-eye-view segmentation
    Borse, Shubhankar
    Klingner, Marvin
    Ravi, Varun
    Cai, Hong
    Almuzairee, Abdulaziz
    Yogamani, Senthil
    Porikli, Fatih
    MACHINE VISION AND APPLICATIONS, 2023, 34 (04)
  • [39] View Consistent Purification for Accurate Cross-View Localization
    Wang, Shan
    Zhang, Yanhao
    Perincherry, Akhil
    Vora, Ankit
    Li, Hongdong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8163 - 8172
  • [40] Cross-view Embeddings for Information Retrieval
    Gupta, Parth
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2019, (62) : 115 - 118