Attention Flow: End-to-End Joint Attention Estimation

Cited by: 0

Authors:

Sumer, Omer [1]

Gerjets, Peter [2]

Trautwein, Ulrich [1]

Kasneci, Enkelejda [1]

Affiliations:

[1] Univ Tubingen, Tubingen, Germany

[2] Leibniz Inst Wissensmedien, Tubingen, Germany

Source:

2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2020

Keywords:

CHILDREN; MODEL;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper addresses the problem of understanding joint attention in third-person social scene videos. Joint attention is the shared gaze behaviour of two or more individuals on an object or an area of interest, and it has a wide range of applications, including human-computer interaction, educational assessment, and the treatment of patients with attention disorders. Our method, Attention Flow, learns joint attention in an end-to-end fashion by using saliency-augmented attention maps and two novel convolutional attention mechanisms that select relevant features and improve joint attention localization. We compare the effect of saliency maps and attention mechanisms and report quantitative and qualitative results on the detection and localization of joint attention in the VideoCoAtt dataset, which contains complex social scenes.

Pages: 3316-3325

Page count: 10
