Monocular three-dimensional object detection using data augmentation and self-supervised learning in autonomous driving

Cited by: 1

Authors:

Thayalan, Sugirtha [1]

Muthukumarasamy, Sridevi [1]

Santhakumar, Khailash [2]

Ravi, Kiran Bangalore [3]

Liu, Hao [3]

Gauthier, Thomas [3]

Yogamani, Senthil [4]

Affiliations:

[1] Natl Inst Technol, Dept Comp Sci & Engn, Trichy, Tamil Nadu, India

[2] SASTRA Univ, Thanjavur, Tamilnadu, India

[3] Navya, Paris, France

[4] Valeo Vis Syst, Comp Vis Platform, Tuam, Ireland

Source:

JOURNAL OF ELECTRONIC IMAGING | 2023 / Vol. 32 / No. 01

Keywords:

monocular three-dimensional detection; data augmentation; self-supervised learning

DOI:

10.1117/1.JEI.32.1.011004

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Monocular three-dimensional (3D) object detection (OD) is an essential and challenging task in autonomous driving. Modern convolutional neural network-based architectures for OD rely heavily on data augmentation (DA) and self-supervised learning (SSL). However, these techniques remain relatively unexplored for monocular 3D OD, especially in autonomous driving. DA techniques for two-dimensional OD do not extend directly to 3D objects. The literature shows that extending them requires adapting the 3D geometry of the input scene and synthesizing new viewpoints, which in turn requires accurate depth information that may not always be available. We propose augmentations for monocular 3D OD that do not require view synthesis. The proposed method combines DA with an SSL approach that uses multiobject labeling as the pretext task. We evaluate the proposed DA-SSL approach on the RTM3D detection model (baseline), with and without DA. The results show that SSL improves on the baseline scores by 2% to 3% in mAP 3D and by 0.9% to 1.5% in Bird's Eye View (BEV). We also propose an inverse class frequency weighted (ICFW) mAP score that highlights detection improvements for low-frequency classes in class-imbalanced datasets with long tails. Using this score to account for the class imbalance in the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) validation dataset, we observe improvements in both ICFW mAP 3D and ICFW BEV scores. With the pretext task, we achieve a 4% to 5% increase in the ICFW metrics.
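
The abstract names the ICFW mAP score but does not give its formula. As a minimal sketch of one plausible reading, the Python below takes a weighted mean of per-class average precision (AP), with each class weight proportional to the inverse of that class's share of instances and the weights normalized to sum to one. The function name `icfw_map`, the class names, and all numbers are assumptions made for illustration; they are not taken from the paper and may differ from the authors' definition.

```python
def icfw_map(ap_per_class, instances_per_class):
    """Weighted mean of per-class AP with weights proportional to 1 / class frequency.

    Assumed formulation for illustration only; the paper may define ICFW differently.
    """
    total = sum(instances_per_class.values())
    # Inverse relative frequency per class: rarer classes get larger raw weights.
    inv_freq = {c: total / n for c, n in instances_per_class.items() if n > 0}
    norm = sum(inv_freq.values())
    # Normalize so the weights sum to one, then take the weighted mean of AP.
    return sum(ap_per_class[c] * w / norm for c, w in inv_freq.items())


# Hypothetical per-class AP values and instance counts (not results from the paper).
ap = {"Car": 0.60, "Pedestrian": 0.30, "Cyclist": 0.20}
counts = {"Car": 800, "Pedestrian": 150, "Cyclist": 50}

plain_map = sum(ap.values()) / len(ap)   # unweighted mean over classes
weighted_map = icfw_map(ap, counts)      # emphasizes the rare classes

print(f"plain mAP: {plain_map:.4f}, ICFW mAP: {weighted_map:.4f}")
```

Under this reading, a gain in AP on a rare class moves the ICFW score more than the same gain on a frequent class, which is consistent with the abstract's stated aim of highlighting improvements on low-frequency classes.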


Pages: 19

