Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation

Cited by: 0
Authors
Yang, Haoran [1 ]
Wang, Yan [2 ]
Li, Piji [2 ]
Bi, Wei [2 ]
Lam, Wai [1 ]
Xu, Chen [3 ]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Tencent AI Lab, Shenzhen, Peoples R China
[3] Beijing Univ Technol, Beijing, Peoples R China
Keywords
DOI
None
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commonsense generation aims to generate a plausible sentence containing all given unordered concept words. Previous methods for this task usually concatenate these words directly as the input to a pre-trained language model (PLM). However, during PLM pre-training, the inputs are typically corrupted sentences whose word order is correct. This discrepancy between the input distributions of pre-training and fine-tuning makes it difficult for the model to fully utilize the knowledge encoded in PLMs. In this paper, we propose a two-stage framework to alleviate this issue. First, in the pre-training stage, we design a new input format that endows PLMs with the ability to handle masked sentences whose word order is incorrect. Second, during fine-tuning, we insert the special token [MASK] between every two consecutive concept words so that the input distribution more closely resembles that of pre-training. We conduct extensive experiments and provide a thorough analysis to demonstrate the effectiveness of our proposed method. The code is available at https://github.com/LHRYANG/CommonGen.
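The fine-tuning trick described in the abstract, inserting [MASK] between consecutive concept words before feeding them to the PLM, can be illustrated with a short sketch. The snippet below is only a hedged illustration of that input construction, not the authors' code: the choice of facebook/bart-base, the use of the tokenizer's own mask token in place of [MASK], and the helper build_masked_concept_input are assumptions made for the example; the released implementation at the GitHub link above should be treated as authoritative.

```python
# Minimal sketch of the masked concept input described in the abstract.
# Assumptions (not taken from the paper): a BART-style PLM via Hugging Face
# Transformers, with the tokenizer's mask token standing in for [MASK].
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

def build_masked_concept_input(concepts):
    """Join the unordered concept words with the mask token so the
    fine-tuning input resembles a corrupted sentence from pre-training."""
    return f" {tokenizer.mask_token} ".join(concepts)

concepts = ["dog", "frisbee", "catch", "throw"]
source = build_masked_concept_input(concepts)
print(source)  # dog <mask> frisbee <mask> catch <mask> throw

# The masked string is then encoded as the encoder input for generation.
batch = tokenizer(source, return_tensors="pt")
```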
Pages: 376 - 383
Number of pages: 8
Related papers
50 records in total
  • [41] P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning
    Hu, Xiaomeng
    Yu, Shi
    Xiong, Chenyan
    Liu, Zhenghao
    Liu, Zhiyuan
    Yu, Ge
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1956 - 1962
  • [42] Pre-training using pseudo images and fine-tuning using real images for nighttime traffic sign detection
    Yamamoto M.
    Ohashi G.
    IEEJ Transactions on Electronics, Information and Systems, 2021, 141 (09) : 969 - 976
  • [43] Training Deep Spiking Convolutional Neural Networks With STDP-Based Unsupervised Pre-training Followed by Supervised Fine-Tuning
    Lee, Chankyu
    Panda, Priyadarshini
    Srinivasan, Gopalakrishnan
    Roy, Kaushik
    FRONTIERS IN NEUROSCIENCE, 2018, 12
  • [44] Adaptive Pre-Training and Collaborative Fine-Tuning: A Win-Win Strategy to Improve Review Analysis Tasks
    Mao, Qianren
    Li, Jianxin
    Lin, Chenghua
    Chen, Congwen
    Peng, Hao
    Wang, Lihong
    Yu, Philip S.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 622 - 634
  • [45] Pre-training and Fine-tuning Neural Topic Model: A Simple yet Effective Approach to Incorporating External Knowledge
    Zhang, Linhai
    Hu, Xumeng
    Wang, Boyu
    Zhou, Deyu
    Zhang, Qian-Wen
    Cao, Yunbo
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 5980 - 5989
  • [46] APF-GAN: Exploring asymmetric pre-training and fine-tuning strategy for conditional generative adversarial network
    Yuxuan Li
    Lingfeng Yang
    Xiang Li
    Computational Visual Media, 2024, 10 : 187 - 192
  • [47] Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
    Zhang, Haode
    Liang, Haowen
    Zhang, Yuwei
    Zhan, Liming
    Wu, Xiao-Ming
    Lu, Xiaolei
    Lam, Albert Y. S.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 532 - 542
  • [48] APF-GAN: Exploring asymmetric pre-training and fine-tuning strategy for conditional generative adversarial network
    Li, Yuxuan
    Yang, Lingfeng
    Li, Xiang
    COMPUTATIONAL VISUAL MEDIA, 2024, 10 (01) : 187 - 192
  • [49] Robust Lane Detection Through Self Pre-Training With Masked Sequential Autoencoders and Fine-Tuning With Customized PolyLoss
    Li, Ruohan
    Dong, Yongqi
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (12) : 14121 - 14132
  • [50] Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training
    Zhang, Haode
    Liang, Haowen
    Zhan, Liming
    Lam, Albert Y. S.
    Wu, Xiao-Ming
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 11105 - 11119