Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

Cited by: 0

Authors:

Zhu, Yaoming [1]

Sun, Zewei [1]

Cheng, Shanbo [1]

Huang, Luyang [1]

Wu, Liwei [1]

Wang, Mingxuan [1]

Affiliations:

[1] ByteDance, Shenzhen, People's Republic of China

Source:

Findings of the Association for Computational Linguistics, ACL 2023 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent work has questioned the necessity of visual information in Multimodal Machine Translation (MMT). This paper tries to answer this question and builds a new benchmark. Because the available dataset is simple and its text input is self-sufficient, we introduce a challenging dataset called EMMT, whose test set is deliberately designed to ensure ambiguity. More importantly, we study this problem in a real-world scenario, aiming to make the most of multimodal training data. We propose a new framework, 2/3-Triplet, which can naturally make full use of large-scale image-text and parallel text-only data. Extensive experiments show that visual information is highly crucial in EMMT. The proposed 2/3-Triplet outperforms a strong text-only competitor by 3.8 BLEU and even surpasses a commercial translation system.

Pages: 2679-2697

Page count: 19
