Unified Language Model Pre-training for Natural Language Understanding and Generation

Cited by: 0
Authors
Dong, Li [1 ]
Yang, Nan [1 ]
Wang, Wenhui [1 ]
Wei, Furu [1 ]
Liu, Xiaodong [1 ]
Wang, Yu [1 ]
Gao, Jianfeng [1 ]
Zhou, Ming [1 ]
Hon, Hsiao-Wuen [1 ]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new UNIfied pre-trained Language Model (UNILM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. The unified modeling is achieved by employing a shared Transformer network and utilizing specific self-attention masks to control what context the prediction conditions on. UNILM compares favorably with BERT on the GLUE benchmark, and the SQuAD 2.0 and CoQA question answering tasks. Moreover, UNILM achieves new state-of-the-art results on five natural language generation datasets, including improving the CNN/DailyMail abstractive summarization ROUGE-L to 40.51 (2.04 absolute improvement), the Gigaword abstractive summarization ROUGE-L to 35.75 (0.86 absolute improvement), the CoQA generative question answering F1 score to 82.5 (37.1 absolute improvement), the SQuAD question generation BLEU-4 to 22.12 (3.75 absolute improvement), and the DSTC7 document-grounded dialog response generation NIST-4 to 2.67 (human performance is 2.65). The code and pre-trained models are available at https://github.com/microsoft/unilm.
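The abstract's key mechanism is that a single shared Transformer serves all three objectives, with the conditioning context controlled purely by the self-attention mask. As a rough illustration of that idea (a minimal sketch, not the authors' released implementation; the helper `build_attention_mask` and its arguments are invented here), the three mask patterns could be constructed in PyTorch as follows:

```python
import torch

def build_attention_mask(seq_len: int, mode: str, src_len: int = None) -> torch.Tensor:
    """Return a (seq_len x seq_len) self-attention mask where mask[i, j] = 1
    means token i may attend to token j and 0 means attention is blocked.

    Hypothetical helper for illustration only; not the released UNILM code.
    """
    if mode == "bidirectional":
        # Cloze-style (BERT-like) objective: every token sees the full context.
        return torch.ones(seq_len, seq_len)
    if mode == "unidirectional":
        # Left-to-right LM: token i sees only tokens 0..i.
        return torch.tril(torch.ones(seq_len, seq_len))
    if mode == "seq2seq":
        # Source segment attends bidirectionally within itself; target segment
        # attends to the whole source plus previously generated target tokens.
        assert src_len is not None, "seq2seq mode needs the source-segment length"
        mask = torch.zeros(seq_len, seq_len)
        mask[:, :src_len] = 1.0  # every token may look at the source segment
        tgt_len = seq_len - src_len
        mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len))
        return mask
    raise ValueError(f"unknown mode: {mode}")

# Example: 4 source tokens followed by 3 target tokens under the seq2seq objective.
print(build_attention_mask(7, "seq2seq", src_len=4))
```

In practice such a mask is typically applied by adding a large negative value to the attention logits at blocked positions before the softmax, so the same shared Transformer weights can be reused across all three pre-training objectives.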
Pages: 13
Related Papers
50 records in total
  • [1] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
    Li, Junnan
    Li, Dongxu
    Xiong, Caiming
    Hoi, Steven
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [2] Self-training Improves Pre-training for Natural Language Understanding
    Du, Jingfei
    Grave, Edouard
    Gunel, Beliz
    Chaudhary, Vishrav
    Celebi, Onur
    Auli, Michael
    Stoyanov, Veselin
    Conneau, Alexis
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 5408-5418
  • [3] Unified Pre-training for Program Understanding and Generation
    Ahmad, Wasi Uddin
    Chakraborty, Saikat
    Ray, Baishakhi
    Chang, Kai-Wei
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 2655-2668
  • [4] Multimodal Pre-training Method for Vision-language Understanding and Generation
    Liu, Tian-Yi
    Wu, Zu-Xuan
    Chen, Jing-Jing
    Jiang, Yu-Gang
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34(05): 2024-2034
  • [5] Cross-Lingual Natural Language Generation via Pre-Training
    Chi, Zewen
    Dong, Li
    Wei, Furu
    Wang, Wenhui
    Mao, Xian-Ling
    Huang, Heyan
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 7570-7577
  • [6] UNILMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
    Bao, Hangbo
    Dong, Li
    Wei, Furu
    Wang, Wenhui
    Yang, Nan
    Liu, Xiaodong
    Wang, Yu
    Piao, Songhao
    Gao, Jianfeng
    Zhou, Ming
    Hon, Hsiao-Wuen
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020
  • [7] MPNet: Masked and Permuted Pre-training for Language Understanding
    Song, Kaitao
    Tan, Xu
    Qin, Tao
    Lu, Jianfeng
    Liu, Tie-Yan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [8] SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding
    Chung, Yu-An
    Zhu, Chenguang
    Zeng, Michael
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 1897-1907
  • [9] Task-adaptive Pre-training and Self-training are Complementary for Natural Language Understanding
    Li, Shiyang
    Yavuz, Semih
    Chen, Wenhu
    Yan, Xifeng
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 1006-1015
  • [10] Speech Model Pre-training for End-to-End Spoken Language Understanding
    Lugosch, Loren
    Ravanelli, Mirco
    Ignoto, Patrick
    Tomar, Vikrant Singh
    Bengio, Yoshua
    [J]. INTERSPEECH 2019, 2019: 814-818