A Visual Attention Grounding Neural Model for Multimodal Machine Translation

Cited by: 0
Authors
Zhou, Mingyang [1 ]
Cheng, Runxiang [1 ]
Lee, Yong Jae [1 ]
Yu, Zhou [1 ]
Affiliations
[1] University of California, Davis, Department of Computer Science, Davis, CA 95616, USA
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract
We introduce a novel multimodal machine translation model that utilizes parallel visual and textual information. Our model jointly optimizes the learning of a shared visual-language embedding and a translator. The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics. Our approach achieves competitive state-of-the-art results on the Multi30K and the Ambiguous COCO datasets. We also collected a new multilingual multimodal product description dataset to simulate a real-world international online shopping scenario. On this dataset, our visual attention grounding model outperforms other methods by a large margin.
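To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of the two jointly trained pieces: an image-conditioned attention over the source-sentence encoder states (the "visual attention grounding"), and a shared visual-language embedding trained with a max-margin ranking loss. All module names, dimensions, and the specific loss form are illustrative assumptions, not the authors' exact implementation; in the full model, the encoder states would additionally feed a translation decoder trained with a standard cross-entropy loss.

```python
# Illustrative sketch only: names, dimensions, and the margin loss are
# assumptions inferred from the abstract, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualAttentionGrounding(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512,
                 img_dim=2048, joint_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU encoder over the source sentence.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True,
                              bidirectional=True)
        # Project image features into a query over the encoder states.
        self.img_query = nn.Linear(img_dim, 2 * hid_dim)
        # Project attended text and the image into a shared joint space.
        self.txt_proj = nn.Linear(2 * hid_dim, joint_dim)
        self.img_proj = nn.Linear(img_dim, joint_dim)

    def forward(self, src_tokens, img_feats):
        # src_tokens: (B, T) token ids; img_feats: (B, img_dim) CNN features.
        enc, _ = self.encoder(self.embed(src_tokens))       # (B, T, 2H)
        query = self.img_query(img_feats).unsqueeze(1)      # (B, 1, 2H)
        scores = torch.bmm(query, enc.transpose(1, 2))      # (B, 1, T)
        attn = F.softmax(scores, dim=-1)
        # Image-grounded summary of the source sentence.
        grounded = torch.bmm(attn, enc).squeeze(1)          # (B, 2H)
        txt = F.normalize(self.txt_proj(grounded), dim=-1)
        img = F.normalize(self.img_proj(img_feats), dim=-1)
        return txt, img, enc  # enc would also feed a translation decoder

def embedding_loss(txt, img, margin=0.1):
    # Max-margin ranking loss pulling matched text/image pairs together
    # and pushing mismatched pairs in the batch apart.
    sim = txt @ img.t()                                     # (B, B)
    pos = sim.diag().unsqueeze(1)                           # matched pairs
    cost = (margin + sim - pos).clamp(min=0)
    cost.fill_diagonal_(0)                                  # ignore positives
    return cost.mean()

# Usage with random data (hypothetical sizes):
model = VisualAttentionGrounding(vocab_size=10000)
tokens = torch.randint(0, 10000, (4, 12))
feats = torch.randn(4, 2048)
txt, img, enc = model(tokens, feats)
loss = embedding_loss(txt, img)
```

Joint optimization would then sum this embedding loss with the translation loss, so the shared space and the translator are learned together as the abstract describes.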
Pages: 3643-3653 (11 pages)
Related Papers (50 total)
  • [1] Supervised Visual Attention for Simultaneous Multimodal Machine Translation
    Haralampieva, Veneta
    Caglayan, Ozan
    Specia, Lucia
    [J]. Journal of Artificial Intelligence Research, 2022, 74: 1059-1089
  • [2] A text-based visual context modulation neural model for multimodal machine translation
    Kwon, Soonmo
    Go, Byung-Hyun
    Lee, Jong-Hyeok
    [J]. Pattern Recognition Letters, 2020, 136: 212-218
  • [3] Bilingual-Visual Consistency for Multimodal Neural Machine Translation
    Liu, Yongwen
    Liu, Dongqing
    Zhu, Shaolin
    [J]. Mathematics, 2024, 12 (15)
  • [4] Neural Machine Translation with Target-Attention Model
    Yang, Mingming
    Zhang, Min
    Chen, Kehai
    Wang, Rui
    Zhao, Tiejun
    [J]. IEICE Transactions on Information and Systems, 2020, E103D (03): 684-694
  • [5] Look Harder: A Neural Machine Translation Model with Hard Attention
    Indurthi, Sathish
    Chung, Insoo
    Kim, Sangha
    [J]. 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019: 3037-3043
  • [6] Neural Machine Translation with GRU-Gated Attention Model
    Zhang, Biao
    Xiong, Deyi
    Xie, Jun
    Su, Jinsong
    [J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31 (11): 4688-4698
  • [7] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    [J]. 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3216-3225
  • [8] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42 (01): 154-163
  • [9] Attention-via-Attention Neural Machine Translation
    Zhao, Shenjian
    Zhang, Zhihua
    [J]. Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, 2018: 563-570