Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

Cited by: 0
Authors
Zhu, Yaoming [1]
Sun, Zewei [1]
Cheng, Shanbo [1]
Huang, Luyang [1]
Wu, Liwei [1]
Wang, Mingxuan [1]
Affiliations
[1] ByteDance, Shenzhen, People's Republic of China
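Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract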
Recent work has questioned the necessity of visual information in Multimodal Machine Translation (MMT). This paper tries to answer that question and builds a new benchmark. Because the available dataset is simple and its text input is self-sufficient, we introduce a challenging dataset called EMMT, whose test set is deliberately designed to ensure ambiguity. More importantly, we study this problem in a real-world scenario, aiming to make the most of multimodal training data. We propose a new framework, 2/3-Triplet, which can naturally make full use of large-scale image-text and parallel text-only data. Extensive experiments show that visual information is highly crucial in EMMT. The proposed 2/3-Triplet outperforms the strong text-only competitor by 3.8 BLEU, and even surpasses a commercial translation system.
Pages: 2679-2697
Page count: 19