A Visual Attention Grounding Neural Model for Multimodal Machine Translation

Cited by: 0
Authors
Zhou, Mingyang [1 ]
Cheng, Runxiang [1 ]
Lee, Yong Jae [1 ]
Yu, Zhou [1 ]
Affiliations
[1] University of California, Davis, Department of Computer Science, Davis, CA 95616, USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We introduce a novel multimodal machine translation model that utilizes parallel visual and textual information. Our model jointly optimizes the learning of a shared visual-language embedding and a translator. The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics. Our approach achieves competitive state-of-the-art results on the Multi30K and the Ambiguous COCO datasets. We also collected a new multilingual multimodal product description dataset to simulate a real-world international online shopping scenario. On this dataset, our visual attention grounding model outperforms other methods by a large margin.
Pages: 3643-3653
Number of pages: 11
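To make the abstract's jointly optimized objective concrete, the sketch below shows one plausible formulation: a shared visual-language embedding trained with a max-margin ranking loss, combined with the translation cross-entropy through a weighted sum. This is a minimal illustration in PyTorch; the projection layers, `margin`, and `alpha` weighting are assumptions for exposition, not the paper's reported architecture or hyperparameters.

```python
# Hypothetical sketch: joint translation + visual-language alignment loss.
# All names, dimensions, and the weighting below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEmbedding(nn.Module):
    """Projects image features and sentence encodings into one shared space."""
    def __init__(self, img_dim=2048, txt_dim=512, embed_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)

    def forward(self, img_feats, txt_feats):
        # L2-normalize so dot products are cosine similarities.
        v = F.normalize(self.img_proj(img_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v, t

def max_margin_loss(v, t, margin=0.1):
    """Contrastive ranking loss over a batch: matched image-sentence pairs
    (the diagonal) must outscore mismatched pairs by at least `margin`."""
    scores = v @ t.t()                                   # (B, B) similarities
    pos = scores.diag().view(-1, 1)                      # matched pairs
    cost_img = (margin + scores - pos).clamp(min=0)      # image vs. wrong text
    cost_txt = (margin + scores - pos.t()).clamp(min=0)  # text vs. wrong image
    mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
    return (cost_img.masked_fill(mask, 0).sum()
            + cost_txt.masked_fill(mask, 0).sum())

def joint_loss(translation_nll, v, t, alpha=0.99):
    """Weighted sum of translation loss and alignment loss;
    `alpha` is a hypothetical trade-off hyperparameter."""
    return alpha * translation_nll + (1 - alpha) * max_margin_loss(v, t)
```

In use, `translation_nll` would be the cross-entropy of the decoder's output against the reference translation, computed on the same batch whose image and sentence encodings feed the alignment term, so that both objectives shape the shared encoder jointly.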
Related Papers
50 records in total
  • [41] Training Deeper Neural Machine Translation Models with Transparent Attention
    Bapna, Ankur
    Chen, Mia Xu
    Firat, Orhan
    Cao, Yuan
    Wu, Yonghui
    [C]. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), 2018: 3028-3033
  • [42] Recursive Annotations for Attention-Based Neural Machine Translation
    Ye, Shaolin
    Guo, Wu
    [C]. 2017 International Conference on Asian Language Processing (IALP), 2017: 164-167
  • [43] Fine-grained attention mechanism for neural machine translation
    Choi, Heeyoul
    Cho, Kyunghyun
    Bengio, Yoshua
    [J]. Neurocomputing, 2018, 284: 171-176
  • [44] Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    [C]. AACL-IJCNLP 2020: Proceedings of the Student Research Workshop of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020: 86-93
  • [45] Towards Understanding Neural Machine Translation with Attention Heads' Importance
    Zhou, Zijie
    Zhu, Junguo
    Li, Weijiang
    [J]. Applied Sciences-Basel, 2024, 14(07)
  • [46] Syntax-Based Attention Masking for Neural Machine Translation
    McDonald, Colin
    Chiang, David
    [C]. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021: 47-52
  • [47] Look-Ahead Attention for Generation in Neural Machine Translation
    Zhou, Long
    Zhang, Jiajun
    Zong, Chengqing
    [C]. Natural Language Processing and Chinese Computing (NLPCC 2017), 2018, 10619: 211-223
  • [48] Selective Attention for Context-aware Neural Machine Translation
    Maruf, Sameen
    Martins, Andre F. T.
    Haffari, Gholamreza
    [C]. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Vol. 1, 2019: 3092-3102
  • [49] Visual Grounding With Joint Multimodal Representation and Interaction
    Zhu, Hong
    Lu, Qingyang
    Xue, Lei
    Xue, Mogen
    Yuan, Guanglin
    Zhong, Bineng
    [J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 1-11
  • [50] Enhancing Neural Machine Translation With Dual-Side Multimodal Awareness
    Song, Yuqing
    Chen, Shizhe
    Jin, Qin
    Luo, Wei
    Xie, Jun
    Huang, Fei
    [J]. IEEE Transactions on Multimedia, 2022, 24: 3013-3024