Multi-aware coreference relation network for visual dialog

Cited by: 0
Authors
Zefan Zhang
Tianling Jiang
Chunping Liu
Yi Ji
Affiliation
School of Computer Science and Technology, Soochow University
Keywords
Visual dialog; Multimedia; Coreference resolution; Cross-modal relationships
DOI
Not available
Abstract
As a challenging cross-media task, visual dialog assesses whether an AI agent can converse in human language based on its understanding of visual content. A critical issue is coreference: references must be resolved not only within the visual modality, but also within language and across the two modalities. In this paper, we propose the Multi-Aware Coreference Relation Network (MACR-Net), which addresses coreference from both the textual and the visual perspective and fuses the two complementary awareness streams. Specifically, its textual coreference relation module identifies textual coreference relations based on a multi-aware textual representation from the textual view. The visual coreference relation module adaptively adjusts visual coreference relations based on a context-aware relation representation from the visual view. Finally, the multi-modal fusion module fuses the multi-aware relations into an aligned representation. Extensive experiments on the VisDial v1.0 benchmark show that MACR-Net achieves state-of-the-art performance.
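The abstract names three components: a textual coreference relation module, a visual coreference relation module, and a multi-modal fusion module. The sketch below is a minimal, assumption-laden PyTorch illustration of how such a three-module pipeline could be wired together; the specific layer choices (multi-head cross-attention, mean pooling, an MLP fusion head) and all dimensions are placeholders for illustration and are not taken from the paper.

```python
# Structural sketch only: module internals, sizes, and attention choices are
# assumptions; the paper's actual MACR-Net architecture is not specified here.
import torch
import torch.nn as nn


class TextualCoreferenceRelation(nn.Module):
    """Relates the current question to the dialog history (textual view)."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, question: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        # Question tokens attend over history tokens to resolve textual coreference.
        ctx, _ = self.attn(question, history, history)
        return self.norm(question + ctx)


class VisualCoreferenceRelation(nn.Module):
    """Adjusts relations among visual regions conditioned on the dialog context."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, regions: torch.Tensor, text_ctx: torch.Tensor) -> torch.Tensor:
        # Visual regions attend over the context-aware textual representation.
        ctx, _ = self.attn(regions, text_ctx, text_ctx)
        return self.norm(regions + ctx)


class MultiModalFusion(nn.Module):
    """Fuses the two awareness streams into a single aligned representation."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, text_feat: torch.Tensor, vis_feat: torch.Tensor) -> torch.Tensor:
        # Pool each stream and fuse by concatenation + MLP (one simple choice).
        fused = torch.cat([text_feat.mean(1), vis_feat.mean(1)], dim=-1)
        return self.proj(fused)


if __name__ == "__main__":
    B, Lq, Lh, R, D = 2, 12, 40, 36, 512   # batch, question/history/region lengths, feature dim
    question = torch.randn(B, Lq, D)
    history = torch.randn(B, Lh, D)
    regions = torch.randn(B, R, D)

    q_aware = TextualCoreferenceRelation(D)(question, history)  # textual coreference view
    v_aware = VisualCoreferenceRelation(D)(regions, q_aware)    # visual coreference view
    aligned = MultiModalFusion(D)(q_aware, v_aware)             # aligned representation
    print(aligned.shape)                                        # torch.Size([2, 512])
```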
Pages: 567-576 (9 pages)
Related papers
50 records in total
  • [1] Multi-aware coreference relation network for visual dialog
    Zhang, Zefan
    Jiang, Tianling
    Liu, Chunping
    Ji, Yi
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2022, 11 (04) : 567 - 576
  • [2] A multi-aware graph convolutional network for driver drowsiness detection
    Lin, Liang
    Wang, Song
    Yang, Jucheng
    Wei, Feng
    KNOWLEDGE-BASED SYSTEMS, 2024, 305
  • [3] Modeling Coreference Relations in Visual Dialog
    Li, Mingxiao
    Moens, Marie-Francine
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 3306 - 3318
  • [4] GoG: Relation-aware Graph-over-Graph Network for Visual Dialog
    Chen, Feilong
    Chen, Xiuyi
    Meng, Fandong
    Li, Peng
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 230 - 243
  • [5] Knowledge-Aware Causal Inference Network for Visual Dialog
    Zhang, Zefan
    Liu, Chunping
    Ji, Yi
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 253 - 261
  • [6] Visual Coreference Resolution in Visual Dialog Using Neural Module Networks
    Kottur, Satwik
    Moura, Jose M. F.
    Parikh, Devi
    Batra, Dhruv
    Rohrbach, Marcus
    COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219 : 160 - 178
  • [7] Textual-Visual Reference-Aware Attention Network for Visual Dialog
    Guo, Dan
    Wang, Hui
    Wang, Shuhui
    Wang, Meng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 6655 - 6666
  • [8] Multi-View Attention Network for Visual Dialog
    Park, Sungjin
    Whang, Taesun
    Yoon, Yeochan
    Lim, Heuiseok
    APPLIED SCIENCES-BASEL, 2021, 11 (07):
  • [9] VD-PCR: Improving visual dialog with pronoun coreference resolution
    Yu, Xintong
    Zhang, Hongming
    Hong, Ruixin
    Song, Yangqiu
    Zhang, Changshui
    PATTERN RECOGNITION, 2022, 125