Scene captioning with deep fusion of images and point clouds

Authors
Yu, Qiang [1 ,3 ]
Zhang, Chunxia [4 ]
Weng, Lubin [1 ]
Xiang, Shiming [2 ,3 ]
Pan, Chunhong [2 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Res Ctr Aerosp Informat, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[4] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Scene captioning; Point cloud; Deep fusion
DOI
10.1016/j.patrec.2022.04.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, the fusion of images and point clouds has received considerable attention in fields such as autonomous driving, where its advantage over single-modal vision has been verified. However, it has not been extensively explored for the scene captioning task. In this paper, a novel scene captioning framework with deep fusion of images and point clouds, based on region correlation and attention, is proposed to improve the performance of captioning models. In our model, a symmetrical processing pipeline is designed for point clouds and images. First, 3D and 2D region features are generated, respectively, through region proposal generation, proposal fusion, and region pooling modules. Then, a feature fusion module integrates the features according to a region correlation rule and an attention mechanism, which increases the interpretability of the fusion process and yields a sequence of fused visual features. Finally, an attention-based caption generation module transforms the fused features into captions. Comprehensive experiments indicate that our model achieves state-of-the-art performance. (c) 2022 Elsevier B.V. All rights reserved.
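The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the region correlation rule, so the sketch assumes that a 3D region correlates with a 2D region when its image-plane projection overlaps the 2D box with an IoU above a threshold. It also assumes scaled dot-product attention, equal feature dimensions for both modalities, and fusion by concatenation. All names (`iou`, `fuse_regions`, `iou_thresh`) are hypothetical.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_regions(img_feats, img_boxes, pc_feats, pc_boxes, iou_thresh=0.5):
    """Fuse each 2D region feature with an attended 3D context vector.

    pc_boxes are assumed to be 3D regions already projected onto the image
    plane (an assumption of this sketch, not a detail from the abstract).
    """
    dim = len(pc_feats[0]) if pc_feats else 0
    fused = []
    for f2d, b2d in zip(img_feats, img_boxes):
        # Assumed correlation rule: keep 3D regions that overlap this 2D region.
        idx = [j for j, b3d in enumerate(pc_boxes) if iou(b2d, b3d) >= iou_thresh]
        if idx:
            # Scaled dot-product attention over the correlated 3D regions only.
            scale = math.sqrt(len(f2d))
            scores = [sum(x * y for x, y in zip(f2d, pc_feats[j])) / scale for j in idx]
            weights = softmax(scores)
            ctx = [sum(w * pc_feats[j][k] for w, j in zip(weights, idx)) for k in range(dim)]
        else:
            # No correlated 3D region: fall back to a zero context vector.
            ctx = [0.0] * dim
        fused.append(list(f2d) + ctx)
    return fused


# Toy data: two image regions, two projected point-cloud regions.
img_feats = [[1.0, 0.0], [0.0, 1.0]]
img_boxes = [(0, 0, 10, 10), (100, 100, 110, 110)]
pc_feats = [[2.0, 0.0], [0.0, 2.0]]
pc_boxes = [(0, 0, 10, 10), (1, 1, 10, 10)]
fused = fuse_regions(img_feats, img_boxes, pc_feats, pc_boxes)
```

Restricting attention to correlated regions keeps each fused feature traceable to specific 2D and 3D regions, which is one plausible reading of the interpretability claim. The resulting sequence `fused` would then be passed to a caption decoder, which this sketch does not cover.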
Pages: 9-15
Number of pages: 7