Indoor Objects and Outdoor Urban Scenes Recognition by 3D Visual Primitives

Cited by: 1

Authors:

Fu, Junsheng [1,3]

Kamarainen, Joni-Kristian [1]

Buch, Anders Glent [2]

Kruger, Norbert [2]

Affiliations:

[1] Tampere Univ Technol, Vis Grp, FIN-33101 Tampere, Finland

[2] Univ Southern Denmark, CARO Grp, Odense, Denmark

[3] Nokia Res Ctr, Tampere, Finland

Source:

COMPUTER VISION - ACCV 2014 WORKSHOPS, PT I | 2015 / Vol. 9008

Keywords:

FEATURES;

DOI:

10.1007/978-3-319-16628-5_20

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection, recognition and pose estimation in 3D images have gained momentum due to the availability of 3D sensors (RGB-D) and the growth of large-scale 3D data, such as city maps. The most popular approach is to extract and match 3D shape descriptors that encode local scene structure but omit visual appearance. Visual appearance can be problematic due to imaging distortions, but the assumption that local shape structures are sufficient to recognise objects and scenes is largely invalid in practice, since objects may have similar shape but different texture (e.g., grocery packages). In this work, we propose an alternative appearance-driven approach which first extracts 2D primitives justified by Marr's primal sketch; these are "accumulated" over multiple views, and the most stable ones are "promoted" to 3D visual primitives. The promoted 3D primitives represent both structure and appearance. For recognition, we propose a fast and effective correspondence matching using random sampling. For quantitative evaluation, we construct a semisynthetic benchmark dataset using a public 3D model dataset of 119 kitchen objects and another benchmark of challenging street-view images from 4 different cities. In the experiments, our method utilises only a stereo view for training. As a result, with the kitchen objects dataset our method achieved an almost perfect recognition rate for +/- 10 degrees of camera viewpoint change and nearly 80% for +/- 20 degrees, and for the street-view benchmarks it achieved 75% accuracy for 160 street-view image pairs, 80% for 96 street-view image pairs, and 92% for 48 street-view image pairs.


Pages: 270-285

Number of pages: 16
