Multimodal Across Domains Gaze Target Detection

Cited by: 2
Authors
Tonini, Francesco [1]
Beyan, Cigdem [1]
Ricci, Elisa [1, 2]
Affiliations
[1] Univ Trento, Dept Informat Engn & Comp Sci, Trento, Italy
[2] Fdn Bruno Kessler, Deep Visual Learning Res Grp, Trento, Italy
Keywords
Gaze target detection; gaze following; domain adaptation; RGB image; depth map; multimodal data;
DOI
10.1145/3536221.3556624
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the gaze target detection problem in single images captured from the third-person perspective. We present a multimodal deep architecture to infer where a person in a scene is looking. This spatial model is trained on the head images of the person-of-interest, scene and depth maps representing rich context information. Unlike several prior approaches, our model does not require supervision of the gaze angles and does not rely on head orientation information and/or the location of the eyes of the person-of-interest. Extensive experiments demonstrate the stronger performance of our method on multiple benchmark datasets. We also investigated several variations of our method by altering the joint learning of multimodal data. Some variations also outperform a few prior methods. For the first time, this paper inspects domain adaptation for gaze target detection, and we empower our multimodal network to effectively handle the domain gap across datasets. The code of the proposed method is available at https://github.com/francescotonini/multimodal-across-domains-gaze-target-detection.
Pages: 420 - 431
Page count: 12