Non-Autoregressive Text Generation with Pre-trained Language Models

Cited by: 0
Authors
Su, Yixuan [1 ]
Cai, Deng [2 ]
Wang, Yan [3 ]
Vandyke, David [4 ]
Baker, Simon [1 ]
Li, Piji [3 ]
Collier, Nigel [1 ]
Affiliations
[1] Univ Cambridge, Language Technol Lab, Cambridge, England
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
[4] Apple, Cupertino, CA USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind that of their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate two common problems of vanilla NAG models: the inflexibility of prefixed output lengths and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output length can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks: text summarization, sentence compression, and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves competitive performance with many strong autoregressive models. We also conduct extensive analysis experiments to reveal the effect of each proposed component.
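As a rough illustration of the backbone idea described in the abstract, the sketch below uses Hugging Face's transformers library to fill a block of [MASK] slots with BERT in one parallel forward pass, with the target length guessed as a ratio of the source length (in the spirit of ratio-first). This is a minimal sketch, not the authors' implementation: the model name, the length ratio of 1.0, and the example input are assumptions for illustration, and the paper's mechanisms for flexible output length and inter-token dependencies are omitted.

```python
# Minimal sketch of one-shot non-autoregressive generation with a BERT
# masked LM (illustrative only; NOT the paper's full model, which adds
# length handling and dependency-modeling mechanisms on top of BERT).
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

source = "non-autoregressive models decode every token at once"

# Ratio-style length guess (hypothetical ratio of 1.0): ratio-first
# similarly assumes output length scales roughly with input length.
target_len = max(1, int(1.0 * len(tokenizer.tokenize(source))))

# Append target_len [MASK] slots after the source text.
text = source + " " + " ".join([tokenizer.mask_token] * target_len)
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one forward pass fills all slots

# Each masked slot is predicted independently via argmax; this is the
# conditional-independence assumption the paper seeks to relax.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)
pred_ids = logits[mask_pos].argmax(dim=-1)
print(tokenizer.decode(pred_ids))
```

Because every slot is filled in a single forward pass, inference cost does not grow with output length as it does in token-by-token autoregressive decoding, which is the speed advantage motivating NAG in the first place.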
Pages: 234-243
Page count: 10
Related papers
50 items in total
  • [1] Non-Autoregressive ASR Modeling Using Pre-Trained Language Models for Chinese Speech Recognition
    Yu, Fu-Hao
    Chen, Kuan-Yu
    Lu, Ke-Han
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1474 - 1482
  • [2] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [3] Improving Non-Autoregressive End-to-End Speech Recognition with Pre-trained Acoustic and Language Models
    Deng, Keqi
    Yang, Zehui
    Watanabe, Shinji
    Higuchi, Yosuke
    Cheng, Gaofeng
    Zhang, Pengyuan
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8522 - 8526
  • [4] Diffusion Models for Non-autoregressive Text Generation: A Survey
    Li, Yifan
    Zhou, Kun
    Zhao, Wayne Xin
    Wen, Ji-Rong
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6692 - 6701
  • [5] Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
    Yu, Dian
    Yu, Zhou
    Sagae, Kenji
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 2251 - 2268
  • [6] Leveraging pre-trained language models for code generation
    Soliman, Ahmed
    Shaheen, Samir
    Hadhoud, Mayada
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3955 - 3980
  • [7] A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models
    Zhang, Hanqing
    Song, Haolin
    Li, Shaoyu
    Zhou, Ming
    Song, Dawei
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [8] Leveraging Pre-Trained Language Model for Summary Generation on Short Text
    Zhao, Shuai
    You, Fucheng
    Liu, Zeng Yuan
    [J]. IEEE ACCESS, 2020, 8 : 228798 - 228803
  • [9] Exploring Pre-trained Language Models for Event Extraction and Generation
    Yang, Sen
    Feng, Dawei
    Qiao, Linbo
    Kan, Zhigang
    Li, Dongsheng
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 5284 - 5294
  • [10] Automatic Title Generation for Text with Pre-trained Transformer Language Model
    Mishra, Prakhar
    Diwan, Chaitali
    Srinivasa, Srinath
    Srinivasaraghavan, G.
    [J]. 2021 IEEE 15TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2021), 2021, : 17 - 24