Domain Adaptation for Arabic Machine Translation: Financial Texts as a Case Study

Authors
Alghamdi, Emad A. [1, 2]
Zakraoui, Jezia [2 ]
Abanmy, Fares A. [2 ]
Affiliations
[1] King Abdulaziz Univ, Ctr Excellence AI & Data Sci, Jeddah 21589, Saudi Arabia
[2] ASAS AI Lab, Riyadh 13518, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 16
Keywords
machine translation; Arabic MT; domain adaptation; financial domain
DOI
10.3390/app14167088
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Neural machine translation (NMT) has shown impressive performance when trained on large-scale corpora. However, generic NMT systems have demonstrated poor performance on out-of-domain translation. To mitigate this issue, several domain adaptation methods have recently been proposed, which often lead to better translation quality than generic NMT systems. While there has been continuous progress in NMT for English and other European languages, domain adaptation in Arabic has received little attention in the literature. The current study, therefore, aims to explore the effectiveness of domain-specific adaptation for Arabic MT (AMT) in a yet unexplored domain: financial news articles. To this end, we developed a parallel corpus for Arabic-English (AR-EN) translation in the financial domain to benchmark different domain adaptation methods. We then fine-tuned several pre-trained NMT and large language models, including ChatGPT-3.5 Turbo, on our dataset. The results showed that fine-tuning pre-trained NMT models on a few well-aligned in-domain AR-EN segments led to noticeable improvement. The quality of ChatGPT translation was superior to that of the other models based on automatic and human evaluations. To the best of our knowledge, this is the first work on fine-tuning ChatGPT towards financial domain transfer learning. To contribute to research in domain translation, we made our datasets and fine-tuned models available.
Pages: 15