The Impact of Word Segmentation Techniques on Neural and Statistical Machine Translation: English-Arabic Case

Authors
Berrichi, Safae [1]
Mazroui, Azzeddine [1]
Affiliations
[1] Mohammed First Univ, Fac Sci, Dept Comp Sci, Oujda, Morocco
Keywords
Machine translation; Morphological segmentation; Sub-word segmentation; Statistical approach; Neural approach;
DOI
10.1007/978-3-030-90633-7_38
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper deals with machine translation between English and Arabic. The task is difficult because of the morphological richness of Arabic and the lack of large parallel corpora. To address these issues, we examine the impact of word segmentation (sub-word and morphological segmentation) on machine translation performance. We test both the statistical approach and the neural approach, which has been widely adopted in recent years owing to its promising results. In our experiments, carried out in the English-to-Arabic direction on the United Nations parallel corpus, applying morphological segmentation to the target language proved very beneficial, whereas sub-word segmentation had no significant impact on either the neural or the statistical models.
Pages: 454-462
Number of pages: 9