Fine-MVO: Toward Fine-Grained Feature Enhancement for Self-Supervised Monocular Visual Odometry in Dynamic Environments

Cited: 0
Authors
Wei, Wenhui [1 ,2 ]
Ping, Yang [3 ]
Li, Jiadong [2 ]
Liu, Xin [2 ]
Zhou, Yangfan [2 ,4 ]
Affiliations
[1] Univ Sci & Technol China, Sch Nanotech & Nanobion, Hefei 230026, Peoples R China
[2] Chinese Acad Sci, Suzhou Inst Nanotech & Nanobion SINANO, Suzhou 215123, Peoples R China
[3] Acad Mil Sci, Beijing 100091, Peoples R China
[4] Guangdong Inst Semicond Micronano Mfg Technol, Foshan 528000, Peoples R China
Keywords
Semantics; Training; Pose estimation; Task analysis; Visual odometry; Robustness; Multitasking; Monocular visual odometry; feature enhancement; self-supervised learning; dynamic environments; multi-task learning;
DOI
10.1109/TITS.2024.3404924
Chinese Library Classification
TU [Building Science];
Discipline Code
0813;
Abstract
Self-supervised monocular visual odometry has the crucial advantage of not depending on ground-truth labels and has shown strong performance in autonomous driving and robotics. However, recent methods suffer from limited feature representations because they rely on coarse semantic masks to handle dynamic objects, which diminishes accuracy in dynamic environments. In contrast to these coarse-grained methods, we present Fine-MVO, a novel self-supervised monocular visual odometry framework that handles dynamic objects through implicit fine-grained feature representations, achieving excellent accuracy and robustness in dynamic environments. First, Fine-MVO introduces an efficient cross-feature augmentation module and a novel loss-weight balancing strategy to effectively leverage fine-grained features carrying implicit semantic information, markedly improving depth estimation accuracy, especially at object boundaries. Second, we design a novel pose-feature enhancement module and an effective two-stage training policy that enable the pose network to focus on robust static regions and temporal information, thereby improving pose estimation in dynamic and long-term environments. Extensive experiments demonstrate the excellent accuracy and generalization of Fine-MVO. Specifically, Fine-MVO achieves a remarkable 36.80% improvement in pose accuracy over the state-of-the-art method on the KITTI dataset, even surpassing geometry-based visual odometry methods that employ loop closure. Furthermore, Fine-MVO generalizes well to the outdoor AirDOS-Shibuya dataset, attaining a notable 28.22% improvement over the current state-of-the-art method, and also demonstrates outstanding generalization on the indoor TUM-RGBD dataset.
Pages: 14
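The abstract outlines a self-supervised depth-and-pose pipeline with a cross-feature augmentation module, a loss-weight balancing strategy, and a two-stage training policy. The following is a minimal, illustrative PyTorch sketch of what such a two-stage depth/pose training loop could look like; the module names (DepthNet, PoseNet), the toy_reprojection placeholder, the loss weights, and all hyperparameters are assumptions made for illustration and do not reflect the authors' implementation.

```python
# Minimal sketch (illustrative only, not the authors' code): a generic
# two-stage training policy for a self-supervised depth + pose pipeline,
# loosely following the abstract. All names, weights, and the toy
# reprojection below are placeholder assumptions.
import torch
import torch.nn as nn

class DepthNet(nn.Module):
    """Toy stand-in for the depth network: the encoder produces the
    fine-grained features the abstract refers to; the decoder predicts depth."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 16, 3, padding=1)
        self.decoder = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, img):
        feats = self.encoder(img)
        depth = torch.sigmoid(self.decoder(feats)) + 1e-3  # keep depth positive
        return depth, feats

class PoseNet(nn.Module):
    """Toy stand-in for the pose network: 6-DoF relative pose per image pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 6),
        )

    def forward(self, img_pair):
        return self.net(img_pair)

def toy_reprojection(img_src, depth, pose):
    """Placeholder for inverse-warping the source image into the target view.
    A real implementation would back-project pixels with `depth`, transform
    them by `pose`, and bilinearly resample `img_src`."""
    scale = 1.0 + 0.01 * pose.mean(dim=1).view(-1, 1, 1, 1)
    return img_src * depth * scale

depth_net, pose_net = DepthNet(), PoseNet()
img_t, img_s = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)

# Stage 1: train depth and pose jointly with a weighted multi-task loss
# (the fixed 1.0 / 0.1 weights stand in for a learned loss-balancing strategy).
opt = torch.optim.Adam(
    list(depth_net.parameters()) + list(pose_net.parameters()), lr=1e-4)
depth, feats = depth_net(img_t)
pose = pose_net(torch.cat([img_t, img_s], dim=1))
photo = (toy_reprojection(img_s, depth, pose) - img_t).abs().mean()
smooth = depth.diff(dim=-1).abs().mean()  # crude depth-smoothness term
loss = 1.0 * photo + 0.1 * smooth
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: freeze the depth network and fine-tune only the pose network,
# e.g. so it concentrates on static regions and temporal cues.
for p in depth_net.parameters():
    p.requires_grad_(False)
opt_pose = torch.optim.Adam(pose_net.parameters(), lr=1e-5)
```

In this sketch, stage 2 simply freezes the depth branch before continuing pose training; any feature-enhancement or static-region weighting the paper applies at that stage would replace the plain fine-tuning step shown here.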
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 7537 - 7547