Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation

Cited: 0
Authors
Yang, Haoran [1 ]
Wang, Yan [2 ]
Li, Piji [2 ]
Bi, Wei [2 ]
Lam, Wai [1 ]
Xu, Chen [3 ]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Tencent AI Lab, Shenzhen, Peoples R China
[3] Beijing Univ Technol, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commonsense generation aims to generate a plausible sentence containing all given unordered concept words. Previous methods for this task usually concatenate these words directly as the input to a pre-trained language model (PLM). However, in the pre-training of PLMs, the inputs are typically corrupted sentences whose word order is correct. This discrepancy in input distribution between pre-training and fine-tuning makes it difficult for the model to fully utilize the knowledge encoded in PLMs. In this paper, we propose a two-stage framework to alleviate this issue. First, in the pre-training stage, we design a new input format to endow PLMs with the ability to handle masked sentences with incorrect word order. Second, during fine-tuning, we insert the special token [MASK] between every two consecutive concept words so that the input distribution is more similar to that seen in pre-training. We conduct extensive experiments and provide a thorough analysis to demonstrate the effectiveness of our proposed method. The code is available at https://github.com/LHRYANG/CommonGen.
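A minimal sketch (not the authors' code) of the fine-tuning input construction described in the abstract: the baseline concatenates the unordered concept words directly, while the proposed format interleaves a mask token between consecutive concepts so the input more closely resembles the corrupted sentences seen during pre-training. The function names, example concepts, and the literal "[MASK]" string are illustrative assumptions; the actual special token depends on the PLM used (e.g., "<mask>" for BART).

```python
# Illustrative sketch of the two input formats mentioned in the abstract.

def baseline_input(concepts):
    """Plain concatenation of the unordered concept words (previous methods)."""
    return " ".join(concepts)

def mask_interleaved_input(concepts, mask_token="[MASK]"):
    """Insert a mask token between every two consecutive concept words
    (proposed fine-tuning input format). The mask token string is an
    assumption and should match the PLM's own special token."""
    return f" {mask_token} ".join(concepts)

if __name__ == "__main__":
    concepts = ["dog", "frisbee", "catch", "throw"]
    print(baseline_input(concepts))          # dog frisbee catch throw
    print(mask_interleaved_input(concepts))  # dog [MASK] frisbee [MASK] catch [MASK] throw
```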
Pages: 376-383
Number of pages: 8
Related Papers
50 records in total
  • [22] Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction
    Su, Peng
    Vijay-Shanker, K.
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [23] Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding
    Cao, Jin
    Wang, Jun
    Hamza, Wael
    Vanee, Kelly
    Li, Shang-Wen
    INTERSPEECH 2020, 2020, : 1570 - 1574
  • [24] Few-Shot Intent Detection via Contrastive Pre-Training and Fine-Tuning
    Zhang, Jian-Guo
    Bui, Trung
    Yoon, Seunghyun
    Chen, Xiang
    Liu, Zhiwei
    Xia, Congying
    Tran, Quan Hung
    Chang, Walter
    Yu, Philip
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1906 - 1912
  • [25] FOOD IMAGE RECOGNITION USING DEEP CONVOLUTIONAL NETWORK WITH PRE-TRAINING AND FINE-TUNING
    Yanai, Keiji
    Kawano, Yoshiyuki
    2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2015,
  • [26] Robust Face Tracking Using Siamese-VGG with Pre-training and Fine-tuning
    Yuan, Shuo
    Yu, Xinguo
    Majid, Abdul
    2019 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS ENGINEERING (ICCRE), 2019, : 170 - 174
  • [27] Breaking the Barrier Between Pre-training and Fine-tuning: A Hybrid Prompting Model for Knowledge-Based VQA
    Sun, Zhongfan
    Hu, Yongli
    Gao, Qingqing
    Jiang, Huajie
    Gao, Junbin
    Sun, Yanfeng
    Yin, Baocai
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4065 - 4073
  • [28] ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
    Xiao, Dongling
    Zhang, Han
    Li, Yukun
    Sun, Yu
    Tian, Hao
    Wu, Hua
    Wang, Haifeng
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3997 - 4003
  • [29] Improving Pre-Training and Fine-Tuning for Few-Shot SAR Automatic Target Recognition
    Zhang, Chao
    Dong, Hongbin
    Deng, Baosong
    REMOTE SENSING, 2023, 15 (06)
  • [30] MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning
    Rafailov, Rafael
    Hatch, Kyle
    Kolev, Victor
    Martin, John D.
    Phielipp, Mariano
    Finn, Chelsea
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229