Bridging the Gap between Pre-Training and Fine-Tuning for Commonsense Generation

Cited: 0
Authors
Yang, Haoran [1 ]
Wang, Yan [2 ]
Li, Piji [2 ]
Bi, Wei [2 ]
Lam, Wai [1 ]
Xu, Chen [3 ]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Tencent AI Lab, Shenzhen, Peoples R China
[3] Beijing Univ Technol, Beijing, Peoples R China
Keywords
DOI
N/A
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Commonsense generation aims to generate a plausible sentence containing all of the given unordered concept words. Previous methods for this task usually concatenate these words directly as the input to a pre-trained language model (PLM). However, during PLMs' pre-training, the inputs are typically corrupted sentences with correct word order. This discrepancy between the input distributions of pre-training and fine-tuning makes it difficult for the model to fully utilize the knowledge of PLMs. In this paper, we propose a two-stage framework to alleviate this issue. First, in the pre-training stage, we design a new input format that endows PLMs with the ability to handle masked sentences whose word order is incorrect. Second, during fine-tuning, we insert the special token [MASK] between every two consecutive concept words so that the fine-tuning input distribution more closely matches that of pre-training. We conduct extensive experiments and provide a thorough analysis to demonstrate the effectiveness of our proposed method. The code is available at https://github.com/LHRYANG/CommonGen.
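The fine-tuning input format described in the abstract can be illustrated with a short sketch. The snippet below is a minimal, hypothetical example (not the authors' released code) of joining unordered concept words with a PLM's mask token, assuming a BART-style tokenizer from Hugging Face transformers; the model name and concept set are illustrative only.

# Minimal sketch: insert the mask token between consecutive concept words so the
# fine-tuning input resembles the masked sentences seen during pre-training.
# Assumes the "transformers" library; facebook/bart-base is an illustrative choice.
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

def build_input(concepts):
    # Join the unordered concept words with the tokenizer's mask token
    # ("<mask>" for BART), mimicking a corrupted, possibly mis-ordered sentence.
    mask = tokenizer.mask_token
    return f" {mask} ".join(concepts)

concepts = ["dog", "frisbee", "catch", "throw"]  # example concept set
text = build_input(concepts)
print(text)  # dog <mask> frisbee <mask> catch <mask> throw
encoded = tokenizer(text, return_tensors="pt")  # ready to feed to the PLM

The actual pre-training input format and decoding setup used by the paper may differ; see the linked repository for the authors' implementation.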
Pages: 376-383
Page count: 8
Related Papers
50 in total
  • [1] Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation
    Wang, Chengyi
    Wu, Yu
    Liu, Shujie
    Yang, Zhenglu
    Zhou, Ming
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9161 - 9168
  • [2] On the Connection between Pre-training Data Diversity and Fine-tuning Robustness
    Ramanujan, Vivek
    Nguyen, Thao
    Oh, Sewoong
    Schmidt, Ludwig
    Farhadi, Ali
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [3] Tri-Train: Automatic Pre-Fine Tuning between Pre-Training and Fine-Tuning for SciNER
    Zeng, Qingkai
    Yu, Wenhao
    Yu, Mengxia
    Jiang, Tianwen
    Weninger, Tim
    Jiang, Meng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4778 - 4787
  • [4] SAR-HUB: Pre-Training, Fine-Tuning, and Explaining
    Yang, Haodong
    Kang, Xinyue
    Liu, Long
    Liu, Yujiang
    Huang, Zhongling
    REMOTE SENSING, 2023, 15 (23)
  • [5] AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
    Li, Ming
    Wu, Jie
    Wang, Xionghui
    Chen, Chen
    Qin, Jie
    Xiao, Xuefeng
    Wang, Rui
    Zheng, Min
    Pan, Xin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 6843 - 6853
  • [6] Improved Fine-Tuning by Better Leveraging Pre-Training Data
    Liu, Ziquan
    Xu, Yi
    Xu, Yuanhong
    Qian, Qi
    Li, Hao
    Ji, Xiangyang
    Chan, Antoni B.
    Jin, Rong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [7] FactGen: Faithful Text Generation by Factuality-aware Pre-training and Contrastive Ranking Fine-tuning
    Lan, Zhibin
    Li, Wei
    Su, Jinsong
    Xiao, Xinyan
    Liu, Jiachen
    Wu, Wenhao
    Lyu, Yajuan
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2023, 76 : 1281 - 1303
  • [8] CREATER: CTR-driven Advertising Text Generation with Controlled Pre-Training and Contrastive Fine-Tuning
    Wei, Penghui
    Yang, Xuanhua
    Liu, Shaoguo
    Wang, Liang
    Zheng, Bo
    2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2022, 2022, : 9 - 17
  • [9] Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
    Chen, Tianlong
    Liu, Sijia
    Chang, Shiyu
    Cheng, Yu
    Amini, Lisa
    Wang, Zhangyang
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 696 - 705