Enhancing Task Performance in Continual Instruction Fine-tuning Through Format Uniformity

Times Cited: 0
Authors
Tan, Xiaoyu [1]
Cheng, Leijun [2]
Qiu, Xihe [2]
Shi, Shaojie [2]
Cheng, Yuan [3]
Chu, Wei [1]
Xu, Yinghui [3]
Qi, Yuan [3]
Affiliations
[1] INF Technology (Shanghai) Co., Ltd., Shanghai, People's Republic of China
[2] Shanghai University of Engineering Science, School of Electronic and Electrical Engineering, Shanghai, People's Republic of China
[3] Fudan University, AI3 Institute, Shanghai, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Large Language Models; Continual Instruction Fine-tuning; Format Uniformity; Catastrophic Forgetting
DOI
10.1145/3626772.3657920
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent advances in large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, primarily through interactive question answering with humans, and mark significant progress toward artificial general intelligence (AGI). Despite their strong general performance, LLMs often exhibit limitations when adapted to domain-specific tasks through instruction fine-tuning (IF). The primary challenge lies in the discrepancy between the data distributions of general and domain-specific contexts, which leads to suboptimal accuracy on specialized tasks. Addressing this requires continual instruction fine-tuning (CIF), particularly supervised fine-tuning (SFT), on targeted domain-specific instruction datasets. Our ablation study reveals that the structure of these instruction datasets critically influences CIF performance, with substantial shifts in the data distribution causing notable performance degradation. In this paper, we introduce a novel framework that enhances CIF by promoting format uniformity. We assess our approach using the Llama2 chat model across various domain-specific instruction datasets. The results demonstrate not only an improvement in task-specific performance under CIF but also a reduction in catastrophic forgetting (CF). This study contributes to the optimization of LLMs for domain-specific applications, highlighting the significance of data structure and distribution in CIF.
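To make the abstract's central idea concrete, the sketch below is an editorial illustration, not the paper's actual procedure (which is not reproduced here). It assumes "format uniformity" means rendering every instruction record with one fixed prompt template, here a simplified single-turn form of the Llama2 chat format, before continual SFT; the record fields ("instruction", "input", "output") and the helper to_uniform_format are hypothetical names.

# Hypothetical sketch of format uniformity: every domain-specific
# instruction record is rendered with one fixed prompt template before
# continual SFT, so the fine-tuning data keeps a single surface format.
UNIFORM_TEMPLATE = "<s>[INST] {instruction} [/INST] {response} </s>"

def to_uniform_format(example: dict) -> str:
    """Map one raw instruction record onto the shared template."""
    instruction = example["instruction"]
    if example.get("input"):
        # Fold any auxiliary context into the instruction slot so that
        # records with and without an input share the same layout.
        instruction = f"{instruction}\n\n{example['input']}"
    return UNIFORM_TEMPLATE.format(
        instruction=instruction.strip(),
        response=example["output"].strip(),
    )

# Records from differently formatted source datasets end up structurally alike.
raw = [
    {"instruction": "Classify the sentiment.", "input": "Great movie!", "output": "positive"},
    {"instruction": "Summarize the finding.", "input": "", "output": "Uniform formats reduce forgetting."},
]
sft_corpus = [to_uniform_format(r) for r in raw]

Under this reading, part of the distributional shift that degrades CIF comes from surface-format variation across source datasets; rendering everything through one template removes that variation while leaving the task content itself unchanged.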
Pages: 2384-2389
Number of Pages: 6