Applying SoftTriple Loss for Supervised Language Model Fine Tuning

Times Cited: 1
Authors
Sosnowski, Witold [1 ]
Wroblewska, Anna [1 ]
Gawrysiak, Piotr [2 ]
Affiliations
[1] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[2] Warsaw Univ Technol, Fac Elect & Informat Technol, Warsaw, Poland
Keywords
SIMILARITY
DOI
10.15439/2022F185
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We introduce TripleEntropy, a new loss function based on cross-entropy and SoftTriple loss, to improve classification performance when fine-tuning general-knowledge pre-trained language models. This loss function improves the strong RoBERTa baseline fine-tuned with cross-entropy loss by about 0.02-2.29 percentage points. Thorough tests on popular datasets using our loss function indicate a steady gain. The fewer samples in the training dataset, the higher the gain: for small-sized datasets it is about 0.71 percentage points, for medium-sized 0.86 percentage points, for large 0.20 percentage points, and for extra-large 0.04 percentage points.
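The abstract describes TripleEntropy as a combination of standard cross-entropy with a SoftTriple loss term applied while fine-tuning a pre-trained encoder such as RoBERTa. The following is a minimal PyTorch sketch of one way such a combined objective could look; the class name SoftTripleHead, the helper triple_entropy_loss, the hyper-parameters (centers_per_class, la, gamma, margin, alpha), and the simple weighted sum of the two terms are illustrative assumptions, not the paper's exact formulation (the center-regularisation term of the original SoftTriple loss is also omitted).

    # Illustrative sketch only: names, hyper-parameters, and the weighted
    # combination below are assumptions, not the paper's exact method.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SoftTripleHead(nn.Module):
        """Simplified SoftTriple-style loss over L2-normalised embeddings."""

        def __init__(self, dim, n_classes, centers_per_class=10,
                     la=20.0, gamma=0.1, margin=0.01):
            super().__init__()
            self.la, self.gamma, self.margin = la, gamma, margin
            self.n_classes, self.K = n_classes, centers_per_class
            # Several learnable centers per class, stored as columns.
            self.centers = nn.Parameter(
                torch.randn(dim, n_classes * centers_per_class))

        def forward(self, emb, labels):
            emb = F.normalize(emb, dim=1)                  # (B, D)
            centers = F.normalize(self.centers, dim=0)     # (D, C*K)
            sim = emb @ centers                            # (B, C*K)
            sim = sim.view(-1, self.n_classes, self.K)     # (B, C, K)
            # Soft assignment of each example to the centers of each class.
            weights = F.softmax(sim / self.gamma, dim=2)
            class_sim = (weights * sim).sum(dim=2)         # (B, C)
            # Subtract a margin only for the ground-truth class.
            delta = torch.zeros_like(class_sim)
            delta[torch.arange(emb.size(0)), labels] = self.margin
            return F.cross_entropy(self.la * (class_sim - delta), labels)

    def triple_entropy_loss(logits, emb, labels, soft_triple_head, alpha=0.5):
        """Weighted sum of standard cross-entropy and the SoftTriple-style term."""
        ce = F.cross_entropy(logits, labels)
        st = soft_triple_head(emb, labels)
        return alpha * ce + (1.0 - alpha) * st

In such a setup the SoftTriple-style head would typically be applied to the pooled sentence embedding produced by the fine-tuned encoder, while the logits come from the usual classification layer; how the two terms are weighted is a design choice not specified in this record.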
Pages: 141-147
Page count: 7
Related Papers
50 in total (items 21-30 shown)
  • [21] Fine-tuning natural language imperatives
    Kaufmann, Magdalena
    JOURNAL OF LOGIC AND COMPUTATION, 2019, 29 (03) : 321 - 348
  • [22] On Surgical Fine-tuning for Language Encoders
    Lodha, Abhilasha
    Belapurkar, Gayatri
    Chalkapurkar, Saloni
    Tao, Yuanming
    Ghosh, Reshmi
    Basu, Samyadeep
    Petrov, Dmitrii
    Srinivasan, Soundararajan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3105 - 3113
  • [23] BloomLLM: Large Language Models Based Question Generation Combining Supervised Fine-Tuning and Bloom's Taxonomy
    Nghia Duong-Trung
    Wang, Xia
    Kravcik, Milos
    TECHNOLOGY ENHANCED LEARNING FOR INCLUSIVE AND EQUITABLE QUALITY EDUCATION, PT II, EC-TEL 2024, 2024, 15160 : 93 - 98
  • [24] SFMD: A Semi-supervised Framework for Pre-trained Language Models Fine-Tuning with Noisy Samples
    Yang, Yiwen
    Duan, Pengfei
    Li, Yongbing
    Zhang, Yifang
    Xiong, Shengwu
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 316 - 328
  • [25] Efficient fine-tuning of short text classification based on large language model
    Wang, Likun
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON MODELING, NATURAL LANGUAGE PROCESSING AND MACHINE LEARNING, CMNM 2024, 2024, : 33 - 38
  • [26] WalkLM: A Uniform Language Model Fine-tuning Framework for Attributed Graph Embedding
    Tan, Yanchao
    Zhou, Zihao
    Lv, Hang
    Liu, Weiming
    Yang, Carl
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [27] Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
    Xu, Runxin
    Luo, Fuli
    Zhang, Zhiyuan
    Tan, Chuanqi
    Chang, Baobao
    Huang, Songfang
    Huang, Fei
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9514 - 9528
  • [28] Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
    Kong, Lingkai
    Jiang, Haoming
    Zhuang, Yuchen
    Lyu, Jie
    Zhao, Tuo
    Zhang, Chao
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 1326 - 1340
  • [29] Fine-Tuning a Large Language Model with Reinforcement Learning for Educational Question Generation
    Lamsiyah, Salima
    El Mahdaouy, Abdelkader
    Nourbakhsh, Aria
    Schommer, Christoph
    ARTIFICIAL INTELLIGENCE IN EDUCATION, PT I, AIED 2024, 2024, 14829 : 424 - 438
  • [30] AgglutiFiT: Efficient Low-Resource Agglutinative Language Model Fine-Tuning
    Li, Zhe
    Li, Xiuhong
    Sheng, Jiabao
    Slamu, Wushour
    IEEE ACCESS, 2020, 8 : 148489 - 148499