Non-Autoregressive Text Generation with Pre-trained Language Models

Cited by: 0
Authors
Su, Yixuan [1 ]
Cai, Deng [2 ]
Wang, Yan [3 ]
Vandyke, David [4 ]
Baker, Simon [1 ]
Li, Piji [3 ]
Collier, Nigel [1 ]
Affiliations
[1] Univ Cambridge, Language Technol Lab, Cambridge, England
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
[4] Apple, Cupertino, CA USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate the two common problems of vanilla NAG models: the inflexibility of pre-fixed output length and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output lengths can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks: text summarization, sentence compression, and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves performance competitive with many strong autoregressive models. In addition, we conduct extensive analysis experiments to reveal the effect of each proposed component.
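As a rough illustration of the non-autoregressive idea described in the abstract, the sketch below fills every target position in parallel with an off-the-shelf masked language model through the public Hugging Face transformers API. The checkpoint (bert-base-uncased), the fixed target length, and the single-pass argmax fill-in are illustrative assumptions only; this is not the authors' model, which adds dedicated length-handling mechanisms and the ratio-first decoding strategy.

```python
# Minimal sketch of vanilla non-autoregressive decoding with a masked LM.
# NOT the paper's implementation; it only shows the parallel fill-in idea.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

source = "non autoregressive generation predicts all tokens at once"
target_len = 8  # assumed/estimated output length (vanilla NAG fixes this in advance)

# Build the input: source tokens followed by `target_len` [MASK] placeholders.
src_ids = tokenizer(source, add_special_tokens=False)["input_ids"]
mask_id = tokenizer.mask_token_id
input_ids = torch.tensor(
    [[tokenizer.cls_token_id] + src_ids + [tokenizer.sep_token_id]
     + [mask_id] * target_len + [tokenizer.sep_token_id]]
)

with torch.no_grad():
    logits = model(input_ids).logits  # one forward pass, no left-to-right loop

# Fill every masked position independently -- the "conditional independence"
# limitation that the paper's proposed mechanisms are designed to mitigate.
mask_positions = (input_ids[0] == mask_id).nonzero(as_tuple=True)[0]
pred_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(pred_ids))
```

Because all positions are predicted in a single forward pass, inference cost does not grow with output length as it does in autoregressive decoding, which is the speed advantage the abstract refers to.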
Pages: 234-243
Page count: 10