Non-Autoregressive Text Generation with Pre-trained Language Models

Cited by: 0
Authors
Su, Yixuan [1 ]
Cai, Deng [2 ]
Wang, Yan [3 ]
Vandyke, David [4 ]
Baker, Simon [1 ]
Li, Piji [3 ]
Collier, Nigel [1 ]
Affiliations
[1] Univ Cambridge, Language Technol Lab, Cambridge, England
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Tencent AI Lab, Bellevue, WA USA
[4] Apple, Cupertino, CA USA
Keywords
DOI
Not available
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference speed. However, the generation quality of existing NAG models still lags behind that of their autoregressive counterparts. In this work, we show that BERT can be employed as the backbone of a NAG model to greatly improve performance. Additionally, we devise mechanisms to alleviate two common problems of vanilla NAG models: the inflexibility of prefixed output lengths and the conditional independence of individual token predictions. Lastly, to further increase the speed advantage of the proposed model, we propose a new decoding strategy, ratio-first, for applications where the output length can be approximately estimated beforehand. For a comprehensive evaluation, we test the proposed model on three text generation tasks: text summarization, sentence compression, and machine translation. Experimental results show that our model significantly outperforms existing non-autoregressive baselines and achieves competitive performance with many strong autoregressive models. We also conduct extensive analysis experiments to reveal the effect of each proposed component.
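As a rough illustration of the backbone idea described in the abstract, the sketch below uses Hugging Face's transformers library to fill a block of [MASK] slots with BERT in one parallel forward pass, with the target length guessed as a ratio of the source length (in the spirit of ratio-first). This is a minimal sketch, not the authors' implementation: the model name, the length ratio of 1.0, and the example input are assumptions for illustration, and the paper's mechanisms for flexible output length and inter-token dependencies are omitted.

```python
# Minimal sketch of one-shot non-autoregressive generation with a BERT
# masked LM (illustrative only; NOT the paper's full model, which adds
# length handling and dependency-modeling mechanisms on top of BERT).
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

source = "non-autoregressive models decode every token at once"

# Ratio-style length guess (hypothetical ratio of 1.0): ratio-first
# similarly assumes output length scales roughly with input length.
target_len = max(1, int(1.0 * len(tokenizer.tokenize(source))))

# Append target_len [MASK] slots after the source text.
text = source + " " + " ".join([tokenizer.mask_token] * target_len)
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one forward pass fills all slots

# Each masked slot is predicted independently via argmax; this is the
# conditional-independence assumption the paper seeks to relax.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)
pred_ids = logits[mask_pos].argmax(dim=-1)
print(tokenizer.decode(pred_ids))
```

Because every slot is filled in a single forward pass, inference cost does not grow with output length as it does in token-by-token autoregressive decoding, which is the speed advantage motivating NAG in the first place.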
Pages: 234-243
Page count: 10
Related papers
50 items in total
  • [1] Non-Autoregressive ASR Modeling Using Pre-Trained Language Models for Chinese Speech Recognition
    Yu, Fu-Hao
    Chen, Kuan-Yu
    Lu, Ke-Han
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1474 - 1482
  • [2] Pre-Trained Language Models for Text Generation: A Survey
    Li, Junyi
    Tang, Tianyi
    Zhao, Wayne Xin
    Nie, Jian-Yun
    Wen, Ji-Rong
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [3] Improving Non-Autoregressive End-to-End Speech Recognition with Pre-trained Acoustic and Language Models
    Deng, Keqi
    Yang, Zehui
    Watanabe, Shinji
    Higuchi, Yosuke
    Cheng, Gaofeng
    Zhang, Pengyuan
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8522 - 8526
  • [4] Diffusion Models for Non-autoregressive Text Generation: A Survey
    Li, Yifan
    Zhou, Kun
    Zhao, Wayne Xin
    Wen, Ji-Rong
    [J]. PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6692 - 6701
  • [5] Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
    Yu, Dian
    Yu, Zhou
    Sagae, Kenji
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 2251 - 2268
  • [6] Leveraging pre-trained language models for code generation
    Soliman, Ahmed
    Shaheen, Samir
    Hadhoud, Mayada
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (03) : 3955 - 3980
  • [7] A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models
    Zhang, Hanqing
    Song, Haolin
    Li, Shaoyu
    Zhou, Ming
    Song, Dawei
    [J]. ACM COMPUTING SURVEYS, 2024, 56 (03)
  • [8] Leveraging Pre-Trained Language Model for Summary Generation on Short Text
    Zhao, Shuai
    You, Fucheng
    Liu, Zeng Yuan
    [J]. IEEE ACCESS, 2020, 8 : 228798 - 228803
  • [9] Exploring Pre-trained Language Models for Event Extraction and Generation
    Yang, Sen
    Feng, Dawei
    Qiao, Linbo
    Kan, Zhigang
    Li, Dongsheng
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 5284 - 5294
  • [10] Automatic Title Generation for Text with Pre-trained Transformer Language Model
    Mishra, Prakhar
    Diwan, Chaitali
    Srinivasa, Srinath
    Srinivasaraghavan, G.
    [J]. 2021 IEEE 15TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2021), 2021, : 17 - 24