Simulating Human Visual System Based on Vision Transformer

Cited by: 1
Authors
Qiu, Mengyu [1]
Guo, Yi [2]
Zhang, Mingguang [1]
Zhang, Jingwei [1]
Lan, Tian [1]
Liu, Zhilin [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
[2] Chinese Acad Sci Xian, Xian Inst Opt & Precis Mech, Xian, Peoples R China
Keywords
Visual scanpath prediction; fixation duration prediction; saccade sequences; visual attention; scene analysis; eye movements; model
DOI
10.1145/3607822.3616408
Chinese Library Classification
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
The human visual system (HVS) responds in real time to complex visual environments. Scanpath prediction is the task of predicting the eye movements and fixations of an observer freely viewing a visual scene, with the aim of simulating the HVS. In this paper, we propose a vision transformer-based model to study the attentional processes of the HVS as it analyzes visual scenes, and thereby to predict scanpaths. This technology has important applications in human-computer interaction, virtual reality, augmented reality, and other fields. We significantly simplify both the scanpath prediction workflow and the overall model architecture while achieving performance superior to existing methods.
Pages: 5