A Comparative Study of Different Dimensionality Reduction Techniques for Arabic Machine Translation

Cited by: 1
Authors
Bensalah, Nouhaila [1]
Ayad, Habib [1]
Adib, Abdellah [1]
El Farouk, Abdelhamid Ibn [2]
Affiliations
[1] Univ Hassan 2, Data Sci & Artificial Intelligence, Casablanca 20000, Morocco
[2] Languages & Cultures Lab, Mohammadia, Morocco
Keywords
dimensionality reduction techniques; post-processing algorithm; Arabic machine translation; Transformer
DOI
10.1145/3634681
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Word embeddings are used across a wide range of natural language processing applications and are also useful for generating representations of sentences, paragraphs, and documents. Because they are a core component of many natural language processing tasks, reducing their size can be beneficial where memory is constrained. Lowering the dimensionality of word embeddings makes them more practical on memory-limited devices, which benefits many real-world applications. This article presents a comparative study of dimensionality reduction techniques for generating efficient lower-dimensional word vectors. In empirical experiments on the Arabic machine translation task, the post-processing algorithm combined with independent component analysis gave the best performance among the techniques considered. This yields a new combination of the post-processing algorithm with a dimensionality reduction technique (independent component analysis) that had not been investigated before. The combination was applied to both contextual and non-contextual word embeddings, reducing the size of the vectors while achieving better translation quality than the original embeddings.
Pages: 17
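
The combination named in the abstract, a post-processing step followed by independent component analysis, might look roughly like the sketch below. The abstract does not define the post-processing algorithm, so the sketch assumes a common variant for word vectors: subtract the mean vector, then project out a few dominant principal directions. The function names `post_process` and `reduce_embeddings`, the choice of two removed directions, the target size of 20, and the random stand-in data are all illustrative assumptions, not values from the article.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA


def post_process(vectors: np.ndarray, n_remove: int) -> np.ndarray:
    """Center the vectors and project out their top `n_remove` principal directions.

    Assumption: this is one common post-processing step for word vectors;
    the abstract does not specify the article's exact procedure.
    """
    centered = vectors - vectors.mean(axis=0)
    pca = PCA(n_components=n_remove).fit(centered)
    top = pca.components_  # shape (n_remove, dim), orthonormal rows
    return centered - centered @ top.T @ top


def reduce_embeddings(vectors: np.ndarray, n_components: int,
                      n_remove: int = 2, seed: int = 0) -> np.ndarray:
    """Post-process the vectors, then reduce them to `n_components` with ICA."""
    cleaned = post_process(vectors, n_remove)
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    return ica.fit_transform(cleaned)


# Hypothetical stand-in for an embedding matrix: 500 "words", 50 dimensions.
# Real use would load pretrained contextual or non-contextual embeddings.
rng = np.random.default_rng(0)
embeddings = rng.laplace(size=(500, 50))

reduced = reduce_embeddings(embeddings, n_components=20)
print(reduced.shape)
```

In a translation pipeline, the reduced matrix would replace the original embedding table, so the rest of the model sees 20-dimensional rather than 50-dimensional inputs. How many directions to remove and how many components to keep would need tuning against translation quality.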