A developmental where-what network for concurrent and interactive visual attention and recognition

Cited: 3
Authors
Ji, Zhengping [1 ]
Weng, Juyang [2 ]
Affiliations
[1] Samsung Semicond Inc, Adv Image Res Lab ARIL, Pasadena, CA 91103 USA
[2] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
Keywords
Developmental learning; Where-what sensorimotor pathways; Attention; Recognition; Brain-inspired neural network; MODEL; ALGORITHM; CORTEX; LAYERS
DOI
10.1016/j.robot.2015.03.004
CLC Classification
TP [Automation and Computer Technology]
Subject Classification Code
0812
Abstract
This paper presents a brain-inspired developmental architecture called the Where-What Network (WWN). The second version, WWN-2, learns concurrent and interactive visual attention and recognition via complementary pathways guided by a "type" motor and a "location" motor. Motor-driven top-down signals, together with bottom-up excitatory activity from the visual input, shape three possible information flows through a Y-shaped network. Using an ℓ0-constrained sparse coding scheme, top-down and bottom-up co-firing leads to a non-iterative, cell-centered synaptic update model that entails strict entropy reduction from earlier to later layers, as well as a dual optimization of update directions and step sizes that depend dynamically on the firing ages of the neurons. Three operational modes for cluttered scenes emerge from the learning process, depending on what is available at the motor area: a context-free mode that detects and recognizes a learned object in a cluttered scene, a location-context mode that recognizes the object at a given location, and a type-context mode that searches the scene for an object of a given type, all within a single network. To demonstrate these attention capabilities and their interaction with visual processing, the proposed network operates in the presence of complex backgrounds, learns on the fly, and achieves engineering-grade performance in terms of attended pixel error and recognition accuracy. Because the architecture is developmental, meaning that its internal representations are learned from pairs of input and motor signals rather than being hand-designed for a specific task, we argue that the same learning principles and computational architecture are potentially applicable to other sensory modalities, such as audition and touch. (C) 2015 Elsevier B.V. All rights reserved.
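The learning mechanism summarized in the abstract can be pictured with a short sketch. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: top_k_firing stands in for the ℓ0-constrained sparse coding (bottom-up and top-down pre-responses are integrated, then only the k highest-responding neurons fire), CellCenteredLayer.update shows a non-iterative, cell-centered synaptic update whose step size shrinks with each neuron's firing age, and operating_mode mirrors the three emergent modes. The equal 0.5/0.5 mixing, the choice of k, the 1/age step schedule, and all names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def top_k_firing(bottom_up, top_down, k=1):
    """l0-constrained sparse coding sketch: integrate bottom-up and
    top-down pre-responses, then let only the k strongest neurons fire."""
    pre = 0.5 * bottom_up + 0.5 * top_down   # balanced integration (assumed)
    response = np.zeros_like(pre)
    winners = np.argsort(pre)[-k:]           # indices of the top-k neurons
    response[winners] = pre[winners]
    return response, winners

class CellCenteredLayer:
    """Each neuron keeps its own synaptic vector and firing age; the
    1/age step size below is a plain running-average schedule standing
    in for the paper's age-dependent dual optimization."""
    def __init__(self, n_neurons, input_dim, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.weights = rng.normal(size=(n_neurons, input_dim))
        self.ages = np.zeros(n_neurons)

    def update(self, x, winners):
        for i in winners:                    # only firing neurons learn
            self.ages[i] += 1
            step = 1.0 / self.ages[i]        # step size shrinks with firing age
            # non-iterative update: pull the winner's synapses toward the input
            self.weights[i] = (1 - step) * self.weights[i] + step * x

def operating_mode(location_clamped, type_clamped):
    """The three emergent modes, keyed by which motor signal is imposed."""
    if not location_clamped and not type_clamped:
        return "context-free: detect and recognize a learned object"
    if location_clamped:
        return "location-context: recognize the object at the given location"
    return "type-context: search the scene for the given object type"

# Example: a toy layer with 8 neurons over a 16-dimensional input patch.
layer = CellCenteredLayer(n_neurons=8, input_dim=16)
x = np.random.default_rng(1).random(16)      # bottom-up input patch
bottom_up = layer.weights @ x                # pre-responses from the input
top_down = np.zeros(8)                       # no motor context: context-free mode
_, winners = top_k_firing(bottom_up, top_down, k=2)
layer.update(x, winners)
```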
Pages: 35-48
Page count: 14
Related Papers
50 records in total
  • [1] Where-What Network 3: Developmental Top-Down Attention for Multiple Foregrounds and Complex Backgrounds
    Luciw, Matthew
    Weng, Juyang
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN 2010), 2010
  • [2] Where-What Network with CUDA: General Object Recognition and Location in Complex Backgrounds
    Wang, Yuekai
    Wu, Xiaofeng
    Song, Xiaoying
    Zhang, Wenqiang
    Weng, Juyang
    ADVANCES IN NEURAL NETWORKS - ISNN 2011, PT II, 2011, 6676: 331+
  • [3] The What and Where of Visual Attention
    Moore, Tirin
    Zirnsak, Marc
    NEURON, 2015, 88(4): 626-628
  • [4] Where-What Network 5: Dealing with Scales for Objects in Complex Background
    Song, Xiaoying
    Zhang, Wenqiang
    Weng, Juyang
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011: 2795-2802
  • [5] Where-What Network 1: "Where" and "What" Assist Each Other Through Top-down Connections
    Ji, Zhengping
    Weng, Juyang
    Prokhorov, Danil
    2008 IEEE 7TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING, 2008: 61+
  • [6] Coordination of What and Where in Visual Attention
    Duncan, J
    PERCEPTION, 1993, 22(11): 1261-1270
  • [7] WWN-2: A Biologically Inspired Neural Network for Concurrent Visual Attention and Recognition
    Ji, Zhengping
    Weng, Juyang
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN 2010), 2010
  • [8] What/Where to Look Next? Modeling Top-Down Visual Attention in Complex Interactive Environments
    Borji, Ali
    Sihite, Dicky N.
    Itti, Laurent
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2014, 44(5): 523-538
  • [9] Visual attention: the where, what, how and why of saliency
    Treue, S
    CURRENT OPINION IN NEUROBIOLOGY, 2003, 13(4): 428-432
  • [10] Knowing what and where: A computational model for visual attention
    Chokshi, K
    Panchev, C
    Wermter, S
    Taylor, JG
    2004 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2004: 519-524