Toward Low-Resource Languages Machine Translation: A Language-Specific Fine-Tuning With LoRA for Specialized Large Language Models

Cited by: 0
Authors
Liang, Xiao [1 ,2 ]
Khaw, Yen-Min Jasmina [1 ]
Liew, Soung-Yue [3 ]
Tan, Tien-Ping [4 ]
Qin, Donghong [2 ]
Affiliations
[1] Univ Tunku Abdul Rahman, Fac Informat & Commun Technol, Dept Comp Sci, Kampar 31900, Malaysia
[2] Guangxi Minzu Univ, Sch Artificial Intelligence, Nanning 530008, Peoples R China
[3] Univ Tunku Abdul Rahman, Fac Informat & Commun Technol, Dept Comp & Commun Technol, Kampar 31900, Malaysia
[4] Univ Sains Malaysia, Sch Comp Sci, George Town 11700, Malaysia
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Machine translation; low-resource languages; large language models; parameter-efficient fine-tuning; LoRA
DOI
10.1109/ACCESS.2025.3549795
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
In the field of computational linguistics, addressing machine translation (MT) challenges for low-resource languages remains crucial, as these languages often lack extensive data compared to high-resource languages. General large language models (LLMs), such as GPT-4 and Llama, primarily trained on monolingual corpora, face significant challenges in translating low-resource languages, often resulting in subpar translation quality. This study introduces Language-Specific Fine-Tuning with Low-rank adaptation (LSFTL), a method that enhances translation for low-resource languages by optimizing the multi-head attention and feed-forward networks of Transformer layers through low-rank matrix adaptation. LSFTL preserves the majority of the model parameters while selectively fine-tuning key components, thereby maintaining stability and enhancing translation quality. Experiments on non-English-centered low-resource Asian languages demonstrated that LSFTL improved COMET scores by 1-3 points compared to specialized multilingual machine translation models. Additionally, LSFTL's parameter-efficient approach allows smaller models to achieve performance comparable to their larger counterparts, highlighting its significance in making machine translation systems more accessible and effective for low-resource languages.
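The core idea summarized in the abstract is low-rank adaptation of the attention and feed-forward sublayers while the remaining model weights stay frozen. The sketch below illustrates that general technique with the Hugging Face peft library on a Llama-style model; it is a minimal illustration, not the authors' exact LSFTL configuration, and the base-model name, target module names, rank, and scaling values are assumptions.

```python
# Minimal LoRA fine-tuning sketch (illustrative, not the paper's exact setup).
# Assumes Hugging Face transformers + peft and a Llama-style model whose
# attention/FFN projections are named q_proj, k_proj, v_proj, o_proj,
# gate_proj, up_proj, down_proj.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder base model
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA adds trainable low-rank matrices A and B so that the effective weight
# becomes W' = W + (alpha / r) * B @ A, with the original W kept frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                 # rank of the update matrices (assumed value)
    lora_alpha=32,        # scaling factor (assumed value)
    lora_dropout=0.05,
    target_modules=[      # multi-head attention and feed-forward projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the injected low-rank weights train
```

Training would then proceed with a standard supervised fine-tuning loop on parallel data for the target low-resource language pair, updating only the injected low-rank matrices.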
Pages: 46616-46626
Page count: 11
Related Papers
50 records in total
  • [41] Cross-Lingual Transfer with Language-Specific Subnetworks for Low-Resource Dependency Parsing
    Choenni, Rochelle
    Garrette, Dan
    Shutova, Ekaterina
    COMPUTATIONAL LINGUISTICS, 2023, 49 (03) : 613 - 641
  • [42] Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
    Dhamecha, Tejas Indulal
    Murthy, Rudra V.
    Bharadwaj, Samarth
    Sankaranarayanan, Karthik
    Bhattacharyya, Pushpak
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8584 - 8595
  • [43] Efficient Adaptation: Enhancing Multilingual Models for Low-Resource Language Translation
    Sel, Ilhami
    Hanbay, Davut
    MATHEMATICS, 2024, 12 (19)
  • [44] DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning
    Homskiy, Daniil
    Maloyan, Narek
    17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 1537 - 1541
  • [45] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
    Zong, Yongshuo
    Bohdal, Ondrej
    Yu, Tingyang
    Yang, Yongxin
    Hospedales, Timothy
    arXiv preprint
  • [46] Parameter-efficient fine-tuning in large language models: a survey of methodologies
    Wang, Luping
    Chen, Sheng
    Jiang, Linnan
    Pan, Shu
    Cai, Runze
    Yang, Sen
    Yang, Fei
    ARTIFICIAL INTELLIGENCE REVIEW, 58 (8)
  • [47] Prompting or Fine-tuning? A Comparative Study of Large Language Models for Taxonomy Construction
    Chen, Boqi
    Yi, Fandi
    Varro, Daniel
    2023 ACM/IEEE INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS COMPANION, MODELS-C, 2023, : 588 - 596
  • [48] Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages
    Adeyemi, Mofetoluwa
    Oladipo, Akintunde
    Pradeep, Ronak
    Lin, Jimmy
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2: SHORT PAPERS, 2024, : 650 - 656
  • [49] Enhanced Discriminative Fine-Tuning of Large Language Models for Chinese Text Classification
    Song, Jinwang
    Zan, Hongying
    Zhang, Kunli
    2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024, : 168 - 174
  • [50] On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages
    Chen, Fuxiang
    Fard, Fatemeh H.
    Lo, David
    Bryksin, Timofey
    30TH IEEE/ACM INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC 2022), 2022, : 401 - 412