Multi-interaction Network with Object Relation for Video Question Answering

Cited by: 50

Authors:

Jin, Weike [1]

Zhao, Zhou [1]

Gu, Mao [1]

Yu, Jun [2]

Xiao, Jun [1]

Zhuang, Yueting [1]

Affiliations:

[1] Zhejiang Univ, Hangzhou, Peoples R China

[2] Hangzhou Dianzi Univ, Hangzhou, Peoples R China

Source:

PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19) | 2019

Funding:

National Natural Science Foundation of China; Natural Science Foundation of Zhejiang Province;

Keywords:

video question answering; multi-interaction; object relation;

DOI:

10.1145/3343031.3351065

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Subject classification codes:

081203 ; 0835 ;

Abstract:

Video question answering is an important task for testing a machine's ability to understand video. Existing methods normally focus on combining recurrent and convolutional neural networks to capture the spatial and temporal information of the video. Recently, some work has also shown that using attention mechanisms can achieve better performance. In this paper, we propose a new model called the Multi-interaction network for video question answering. There are two types of interactions in our model. The first type is the multi-modal interaction between the visual and textual information. The second type is the multi-level interaction inside the multi-modal interaction. Specifically, instead of using the original self-attention, we propose a new attention mechanism called multi-interaction, which can capture both element-wise and segment-wise sequence interactions simultaneously. In addition to the normal frame-level interaction, we also take object relations into consideration in order to obtain more fine-grained information, such as motions and other potential relations among these objects. We evaluate our method on TGIF-QA and two other video QA datasets. The qualitative and quantitative experimental results show the effectiveness of our model, which achieves new state-of-the-art performance.


Pages: 1193-1201

Page count: 9

