Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

Authors
Zhu, Yaoming [1]
Sun, Zewei [1]
Cheng, Shanbo [1]
Huang, Luyang [1]
Wu, Liwei [1]
Wang, Mingxuan [1]
Affiliation
[1] ByteDance, Shenzhen, People's Republic of China
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent work has questioned the necessity of visual information in Multimodal Machine Translation (MMT). This paper tries to answer this question and builds a new benchmark. Because the available dataset is simple and its text input is self-sufficient, we introduce a challenging dataset called EMMT, whose test set is deliberately designed to ensure ambiguity. More importantly, we study this problem in a real-world scenario, aiming to make the most of multimodal training data. We propose a new framework, 2/3-Triplet, which can naturally make full use of large-scale image-text and parallel text-only data. Extensive experiments show that visual information is highly crucial in EMMT. The proposed 2/3-Triplet outperforms the strong text-only competitor by 3.8 BLEU and even surpasses a commercial translation system.
Pages: 2679-2697
Page count: 19