Cross-View Semantic Segmentation for Sensing Surroundings

Cited by: 118
Authors
Pan, Bowen [1 ]
Sun, Jiankai [2 ]
Leung, Ho Yin Tiga [2 ]
Andonian, Alex [1 ]
Zhou, Bolei [2 ]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] Chinese Univ Hong Kong, Dept Informat Engn, Hong Kong, Peoples R China
Source
Keywords
Semantic scene understanding; deep learning for visual perception; visual learning; visual-based navigation; computer vision for other robotic applications;
DOI
10.1109/LRA.2020.3004325
Chinese Library Classification (CLC) number
TP24 [Robotics];
Discipline classification code
080202; 1405;
Abstract
Sensing surroundings plays a crucial role in human spatial perception, as it extracts the spatial configuration of objects as well as the free space from the observations. To equip robot perception with such a surrounding-sensing capability, we introduce a novel visual task called Cross-view Semantic Segmentation as well as a framework named View Parsing Network (VPN) to address it. In the cross-view semantic segmentation task, the agent is trained to parse the first-view observations into a top-down-view semantic map indicating the spatial location of all the objects at the pixel level. The main issue of this task is that we lack real-world annotations of top-down-view data. To mitigate this, we train the VPN in a 3D graphics environment and use domain adaptation to transfer it to real-world data. We evaluate our VPN on both synthetic and real-world agents. The experimental results show that our model can effectively make use of the information from different views and multiple modalities to understand spatial information. Our further experiment on a LoCoBot robot shows that our model enables the surrounding sensing capability from 2D image input. Code and demo videos can be found at https://view-parsing-network.github.io.
Pages: 4867-4873
Page count: 7