A computational framework for attentional object discovery in RGB-D videos

Cited by: 0

Authors:

Garcia, German Martin [1]

Pavel, Mircea [1]

Frintrop, Simone [2]

Affiliations:

[1] Univ Bonn, Inst Comp Sci 6, Bonn, Germany

[2] Univ Hamburg, Dept Informat, Comp Vis Grp, Hamburg, Germany

Source:

COGNITIVE PROCESSING | 2017 / Volume 18 / Issue 02

Keywords:

RGB-D object discovery; Computational visual attention; 3D inhibition of return; INHIBITION; RETURN; MODEL;

DOI:

10.1007/s10339-017-0791-z

Chinese Library Classification number:

B84 [Psychology];

Discipline classification code:

04 ; 0402 ;

Abstract:

We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. For this, we propose a visual object candidate generation method to produce object hypotheses about the objects in the scene. An attention system is used to prioritise the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism naturally copes with camera motions and inhibits objects that have already been the target of attention. Our approach provides object candidates which can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher level tasks, our architecture can be used as a first layer in any cognitive system that aims at interpreting a stream of images. We show in the evaluation how our framework finds most of the objects in challenging real-world scenes.
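
The spatial IOR idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SpatialIOR`, the helper `to_world`, the fixed inhibition radius, and the assumption that a camera pose (rotation and translation) is available for every frame are all hypothetical choices made for illustration. The sketch stores attended locations in world coordinates, so a candidate that was already attended stays inhibited after the camera moves.

```python
import math


def to_world(point_cam, rotation, translation):
    """Map a 3D point from camera to world coordinates: R * p + t."""
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


class SpatialIOR:
    """Hypothetical inhibition-of-return memory kept in world coordinates."""

    def __init__(self, radius=0.15):
        self.radius = radius  # assumed inhibition radius, in metres
        self.attended = []    # world-frame positions already attended

    def is_inhibited(self, point_world):
        # A location is inhibited if it lies close to any attended location.
        return any(math.dist(point_world, p) < self.radius for p in self.attended)

    def select(self, candidates, rotation, translation):
        """Pick the most salient candidate that is not inhibited.

        candidates: list of (saliency, point_in_camera_frame) pairs.
        Returns the selected world-frame point, or None if all are inhibited.
        """
        best = None
        for saliency, point_cam in candidates:
            point_world = to_world(point_cam, rotation, translation)
            if self.is_inhibited(point_world):
                continue
            if best is None or saliency > best[0]:
                best = (saliency, point_world)
        if best is None:
            return None
        self.attended.append(best[1])  # inhibit this location from now on
        return best[1]
```

Because the memory lives in the world frame, the same two objects seen from a shifted camera map back to the same stored positions, and attention moves on to the object that has not yet been attended.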


Pages: 169-182

Page count: 14
