Deep 6-DoF camera relocalization in variable and dynamic scenes by multitask learning

Cited: 4
Authors
Wang, Junyi [1 ,2 ]
Qi, Yue [1 ,2 ,3 ]
Affiliations
[1] Beihang Univ, Sch Comp Sci & Engn, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] Beihang Univ, Qingdao Res Inst, Qingdao, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image-based localization; Deep learning; Dynamic localization; Multitask learning; LOCALIZATION; ROBUST; TRACKING;
DOI
10.1007/s00138-023-01388-0
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Direct visual localization with convolutional neural networks has recently attracted attention for its end-to-end pipeline. However, the lack of 3D information limits accuracy, and a single input image is ambiguous in scenes that present similar views at different positions; moreover, relocalization in variable or dynamic scenes remains challenging. To address these concerns, we propose two multitask relocalization networks, MMLNet and MMLNet+, which estimate the 6-DoF camera pose in static, variable and dynamic scenes. First, to address the lack of variable-scene datasets, we construct one with a semiautomatic process that combines SfM and MVS algorithms with a few manual labels; three scenes covering an office, a bedroom and a sitting room are captured and generated. Second, to strengthen the connection between 2D images and 3D poses, we design a multitask network, MMLNet, that regresses both the camera pose and the scene point cloud, adding a Chamfer-distance term to the original pose loss. MMLNet also learns pose-trajectory features by applying LSTM layers to an additional pose-array input, which overcomes the single-image limitation. Building on MMLNet and targeting dynamic and variable scenes, MMLNet+ adds an auxiliary segmentation branch that labels fixed, changeable or dynamic parts of the input image, and a feature fusion block shares features among the three tasks, further improving performance in dynamic and variable environments. Finally, experiments on static, dynamic and our constructed variable-scene datasets demonstrate state-of-the-art relocalization performance for MMLNet and MMLNet+. Ablations also illustrate the positive effects of the pose-learning part, the reconstruction branch and the segmentation task.
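The joint objective described in the abstract, a pose-regression loss augmented with a Chamfer-distance term on the regressed scene point cloud, can be sketched as below. This is a minimal illustrative implementation, not the paper's code: the PoseNet-style loss form, the quaternion parameterization, and the weights `beta` and `gamma` are assumptions for the sketch.

```python
import math

def chamfer_distance(pc_pred, pc_gt):
    """Symmetric Chamfer distance between two 3D point sets:
    average nearest-neighbour squared distance, in both directions."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(pc_pred, pc_gt) + one_way(pc_gt, pc_pred)

def pose_loss(t_pred, q_pred, t_gt, q_gt, beta=1.0):
    """PoseNet-style pose loss: translation L2 error plus a weighted
    rotation error between unit quaternions (assumed formulation)."""
    trans_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_pred, t_gt)))
    norm = math.sqrt(sum(c * c for c in q_pred))       # normalize prediction
    q_unit = [c / norm for c in q_pred]
    rot_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(q_unit, q_gt)))
    return trans_err + beta * rot_err

def total_loss(t_pred, q_pred, t_gt, q_gt, pc_pred, pc_gt, gamma=0.1):
    # Joint objective: pose loss plus a Chamfer term on the
    # reconstructed scene point cloud; gamma balances the two tasks.
    return pose_loss(t_pred, q_pred, t_gt, q_gt) \
        + gamma * chamfer_distance(pc_pred, pc_gt)
```

In a training loop, `total_loss` would be computed per batch over the network's pose and point-cloud heads; a perfect prediction drives every term to zero.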
Pages: 15