CNN-based search model fails to account for human attention guidance by simple visual features

Cited by: 0

Author:

Endel Põder

Affiliation:

[1] University of Tartu, Institute of Psychology

Source:

Attention, Perception, & Psychophysics | 2024 / Vol. 86

Keywords:

Models of attention; Neural network modeling; Visual search; Simple visual features; Feature conjunctions

DOI:

Not available

Abstract:

Recently, Zhang et al. (Nature Communications, 9(1), 3730, 2018) proposed an interesting model of attention guidance that uses visual features learnt by convolutional neural networks (CNNs) for object classification. I adapted this model for search experiments, with accuracy as the measure of performance. Simulation of our previously published feature and conjunction search experiments revealed that the CNN-based search model proposed by Zhang et al. considerably underestimates human attention guidance by simple visual features. Using target-distractor differences instead of target features for attention guidance, or computing the attention map at lower layers of the network, could improve performance. Still, the model fails to reproduce qualitative regularities of human visual search. The most likely explanation is that standard CNNs trained on image classification have not learnt the medium- or high-level features required for human-like attention guidance.

Page numbers: 9 / 15

Number of pages: 6
