A computational framework for attentional object discovery in RGB-D videos

Authors
Garcia, German Martin [1]
Pavel, Mircea [1]
Frintrop, Simone [2]
Affiliations
[1] Univ Bonn, Inst Comp Sci 6, Bonn, Germany
[2] Univ Hamburg, Dept Informat, Comp Vis Grp, Hamburg, Germany
Keywords
RGB-D object discovery; Computational visual attention; 3D inhibition of return; INHIBITION; RETURN; MODEL
DOI
10.1007/s10339-017-0791-z
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. For this, we propose a visual object candidate generation method to produce object hypotheses about the objects in the scene. An attention system is used to prioritise the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism naturally copes with camera motions and inhibits objects that have already been the target of attention. Our approach provides object candidates which can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher level tasks, our architecture can be used as a first layer in any cognitive system that aims at interpreting a stream of images. We show in the evaluation how our framework finds most of the objects in challenging real-world scenes.
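The spatial IOR idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Pose`, `SpatialIOR` and `select_next`, the yaw-plus-translation camera model, and the 0.15 m inhibition radius are all assumptions made for illustration. The sketch only shows why storing attended locations in world coordinates keeps the inhibition valid when the camera moves.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class Pose:
    """Hypothetical camera pose: rotation about the z axis plus a translation."""
    yaw: float
    t: Point3

    def to_world(self, p: Point3) -> Point3:
        # Rotate the camera-frame point by yaw, then translate into the world frame.
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        x, y, z = p
        return (c * x - s * y + self.t[0], s * x + c * y + self.t[1], z + self.t[2])


@dataclass
class SpatialIOR:
    """Remembers attended locations in world coordinates (assumed radius in metres)."""
    radius: float = 0.15
    attended: List[Point3] = field(default_factory=list)

    def is_inhibited(self, world_point: Point3) -> bool:
        return any(math.dist(world_point, q) <= self.radius for q in self.attended)

    def inhibit(self, world_point: Point3) -> None:
        self.attended.append(world_point)


def select_next(
    candidates: List[Tuple[float, Point3]], pose: Pose, ior: SpatialIOR
) -> Optional[Tuple[float, Point3]]:
    """Pick the most salient candidate that has not been attended yet.

    candidates holds (saliency, camera-frame centroid) pairs. The winner is
    returned as (saliency, world-frame centroid) and is inhibited afterwards.
    """
    best: Optional[Tuple[float, Point3]] = None
    for saliency, cam_point in candidates:
        world_point = pose.to_world(cam_point)
        if ior.is_inhibited(world_point):
            continue  # already the target of attention in an earlier frame
        if best is None or saliency > best[0]:
            best = (saliency, world_point)
    if best is not None:
        ior.inhibit(best[1])
    return best
```

Calling `select_next` once per frame with that frame's pose attends to each object at most once: a candidate that reappears at a different image position after the camera moves still maps to the same world point and is skipped.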
Pages: 169-182
Page count: 14