The attentive reconstruction of objects facilitates robust object recognition

Times Cited: 0
Authors
Ahn, Seoyoung [1]
Adeli, Hossein [2]
Zelinsky, Gregory J. [3,4]
Affiliations
[1] Univ Calif Berkeley, Dept Mol & Cell Biol, Berkeley, CA 94720 USA
[2] Columbia Univ, Zuckerman Mind Brain Behav Inst, New York, NY USA
[3] SUNY Stony Brook, Dept Psychol, Stony Brook, NY USA
[4] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Keywords
TOP-DOWN FACILITATION; VISUAL-ATTENTION; NEURAL MECHANISMS; BAYESIAN-INFERENCE; PERCEPTION; MODEL; SHAPE; INTEGRATION; SPOTLIGHT; NETWORKS;
DOI
10.1371/journal.pcbi.1012159
Chinese Library Classification
Q5 [Biochemistry];
Discipline Codes
071010; 081704;
Abstract
Humans are extremely robust in our ability to perceive and recognize objects: we see faces in tea stains and can recognize friends on dark streets. Yet neurocomputational models of primate object recognition have focused on the initial feed-forward pass of processing through the ventral stream, and less on the top-down feedback that likely underlies robust object perception and recognition. Aligned with the generative approach, we propose that the visual system actively facilitates recognition by reconstructing the object hypothesized to be in the image. Top-down attention then uses this reconstruction as a template to bias feedforward processing to align with the most plausible object hypothesis. Building on auto-encoder neural networks, our model makes detailed hypotheses about the appearance and location of candidate objects in the image by reconstructing a complete object representation from visual input that may be incomplete due to noise and occlusion. The model then leverages the best object reconstruction, as measured by reconstruction error, to direct the bottom-up process of selectively routing low-level features, a top-down biasing that captures a core function of attention. We evaluated our model using the MNIST-C (handwritten digits under corruptions) and ImageNet-C (real-world objects under corruptions) datasets. Not only did our model achieve superior performance on these challenging tasks, designed to approximate real-world noise and occlusion viewing conditions, but it also better accounted for human behavioral reaction times and error patterns than a standard feedforward Convolutional Neural Network. Our model suggests that a complete understanding of object perception and recognition requires integrating top-down attention feedback, which we propose is an object reconstruction.

Author Summary
Humans can dream and imagine things, which means that the human brain can generate perceptions of things that are not there. We propose that humans evolved this generative capability not solely to have more vivid dreams, but to help us better understand the world, especially when what we see is unclear or missing details (due to occlusion, changing perspective, etc.). Through a combination of computational modeling and behavioral experiments, we demonstrate how the process of generating objects, actively reconstructing the most plausible object representation from noisy visual input, guides attention towards specific features or locations within an image (known functions of top-down attention), thereby enhancing the system's robustness to various types of noise and corruption. We found that this generative attention mechanism could explain not only the time it took people to recognize challenging objects, but also the types of recognition errors people made (seeing an object as one thing when it was really another). These findings contribute to a deeper understanding of the computational mechanisms of attention in the brain and their potential connection to the generative processes that facilitate robust object recognition.
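The loop the abstract describes (reconstruct under each object hypothesis, keep the hypothesis with the lowest reconstruction error, then reuse its reconstruction as a top-down attention mask over the input) can be sketched in a toy form. This is not the authors' model: the class templates, scaling "decoder", and soft reweighting below are illustrative stand-ins for the paper's learned auto-encoder components.

```python
import numpy as np

# Hypothetical class templates standing in for learned per-class decoders.
TEMPLATES = {
    "bar_h": np.array([[0, 0, 0],
                       [1, 1, 1],
                       [0, 0, 0]], dtype=float),
    "bar_v": np.array([[0, 1, 0],
                       [0, 1, 0],
                       [0, 1, 0]], dtype=float),
}

def reconstruct(x, label):
    """Scale the class template to best fit x (a stand-in for decoding)."""
    t = TEMPLATES[label]
    scale = (x * t).sum() / (t * t).sum()
    return scale * t

def recognize(x, steps=2):
    """Pick the hypothesis with the lowest reconstruction error, then use
    its reconstruction as a top-down attention mask on the input."""
    attended = x
    best = None
    for _ in range(steps):
        errors = {c: ((attended - reconstruct(attended, c)) ** 2).sum()
                  for c in TEMPLATES}
        best = min(errors, key=errors.get)
        # Top-down biasing: emphasize pixels the winning reconstruction
        # predicts, while keeping some bottom-up signal everywhere.
        mask = reconstruct(attended, best)
        attended = x * (0.5 + mask)
    return best

# A noisy horizontal bar: the occluded/corrupted input case.
noisy = np.array([[0.1, 0.0, 0.2],
                  [0.9, 0.7, 1.0],
                  [0.0, 0.2, 0.1]])
print(recognize(noisy))  # → bar_h
```

The key design point mirrored here is that attention is derived from the generative side: the winning reconstruction, not a learned saliency map, determines which low-level features get routed forward on the next pass.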
Pages: 28