Attention Flow: End-to-End Joint Attention Estimation

Authors
Sumer, Omer [1]
Gerjets, Peter [2]
Trautwein, Ulrich [1]
Kasneci, Enkelejda [1]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] Leibniz Inst Wissensmedien, Tubingen, Germany
Keywords
CHILDREN; MODEL;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the problem of understanding joint attention in third-person social scene videos. Joint attention is the shared gaze behaviour of two or more individuals on an object or an area of interest, with applications in human-computer interaction, educational assessment, and the treatment of patients with attention disorders, among others. Our method, Attention Flow, learns joint attention in an end-to-end fashion by using saliency-augmented attention maps and two novel convolutional attention mechanisms that select relevant features and improve joint attention localization. We compare the effect of saliency maps and attention mechanisms and report quantitative and qualitative results on the detection and localization of joint attention in the VideoCoAtt dataset, which contains complex social scenes.
Pages: 3316-3325
Page count: 10