Multi-Task Deep Neural Networks for Natural Language Understanding

Cited by: 0
Authors
Liu, Xiaodong [1 ]
He, Pengcheng [2 ]
Chen, Weizhu [2 ]
Gao, Jianfeng [1 ]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Microsoft Dynam 365 AI, Redmond, WA USA
Keywords
DOI
Not available
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. MT-DNN not only leverages large amounts of cross-task data, but also benefits from a regularization effect that leads to more general representations to help adapt to new tasks and domains. MT-DNN extends the model proposed in Liu et al. (2015) by incorporating a pre-trained bidirectional transformer language model, known as BERT (Devlin et al., 2018). MT-DNN obtains new state-of-the-art results on ten NLU tasks, including SNLI, SciTail, and eight out of nine GLUE tasks, pushing the GLUE benchmark to 82.7% (2.2% absolute improvement). We also demonstrate using the SNLI and SciTail datasets that the representations learned by MT-DNN allow domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations. The code and pre-trained models are publicly available at https://github.com/namisan/mt-dnn.
Pages: 4487 - 4496
Page count: 10
Related papers
50 records in total
  • [1] The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
    Liu, Xiaodong
    Wang, Yu
    Ji, Jianshu
    Cheng, Hao
    Zhu, Xueyun
    Awa, Emmanuel
    He, Pengcheng
    Chen, Weizhu
    Poon, Hoifung
    Cao, Guihong
    Gao, Jianfeng
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020): SYSTEM DEMONSTRATIONS, 2020, : 118 - 126
  • [2] Empirical evaluation of multi-task learning in deep neural networks for natural language processing
    Li, Jianquan
    Liu, Xiaokang
    Yin, Wenpeng
    Yang, Min
    Ma, Liqun
    Jin, Yaohong
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (09): 4417 - 4428
  • [3] BAM! Born-Again Multi-Task Networks for Natural Language Understanding
    Clark, Kevin
    Luong, Minh-Thang
    Khandelwal, Urvashi
    Manning, Christopher D.
    Le, Quoc V.
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 5931 - 5937
  • [4] Creating CREATE queries with multi-task deep neural networks
    Diker, S. Nazmi
    Sakar, C. Okan
    [J]. KNOWLEDGE-BASED SYSTEMS, 2023, 266
  • [5] Evolving Deep Parallel Neural Networks for Multi-Task Learning
    Wu, Jie
    Sun, Yanan
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT II, 2022, 13156 : 517 - 531
  • [6] Hierarchical and Bidirectional Joint Multi-Task Classifiers for Natural Language Understanding
    Ji, Xiaoyu
    Hu, Wanyang
    Liang, Yanyan
    [J]. MATHEMATICS, 2023, 11 (24)
  • [7] Bidirectional Transformer Based Multi-Task Learning for Natural Language Understanding
    Tripathi, Suraj
    Singh, Chirag
    Kumar, Abhay
    Pandey, Chandan
    Jain, Nishant
    [J]. NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2019), 2019, 11608 : 54 - 65
  • [8] Deep Neural Language-agnostic Multi-task Text Classifier
    Gawron, Karol
    Pogoda, Michal
    Ropiak, Norbert
    Swedrowski, Michal
    Kocon, Jan
    [J]. 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 136 - 142
  • [9] Multi-Adaptive Optimization for multi-task learning with deep neural networks
    Hervella, Alvaro S.
    Rouco, Jose
    Novo, Jorge
    Ortega, Marcos
    [J]. NEURAL NETWORKS, 2024, 170 : 254 - 265