JAKET: Joint Pre-training of Knowledge Graph and Language Understanding

Cited by: 0
Authors
Yu, Donghan [1 ]
Zhu, Chenguang [2 ]
Yang, Yiming [1 ]
Zeng, Michael [2 ]
Affiliations
[1] Carnegie Mellon University, Pittsburgh, PA 15213 USA
[2] Microsoft Cognitive Services Research Group, Redmond, WA USA
Funding
US National Science Foundation; US Department of Energy;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge graphs (KGs) contain rich information about world knowledge, entities, and relations, making them a natural supplement to existing pre-trained language models. However, it remains a challenge to efficiently integrate information from a KG into language modeling, and, conversely, understanding a knowledge graph requires related textual context. We propose JAKET, a novel joint pre-training framework that models both the knowledge graph and language. The knowledge module and the language module provide essential information to each other: the knowledge module produces embeddings for entities mentioned in text, while the language module generates context-aware initial embeddings for entities and relations in the graph. This design enables the pre-trained model to adapt easily to unseen knowledge graphs in new domains. Experimental results on several knowledge-aware NLP tasks show that the proposed framework achieves superior performance by effectively leveraging knowledge in language understanding.
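The abstract describes a two-way exchange: the language module embeds the textual descriptions of entities and relations to give the knowledge module context-aware initial embeddings, and the knowledge module's output entity embeddings are in turn fused into the token representations wherever those entities are mentioned in text. Below is a minimal, illustrative PyTorch sketch of that mutual-assistance loop; it is not the paper's released implementation, and the module interfaces, dimensions, single message-passing step, and fusion-by-addition are simplifying assumptions made here for exposition.

import torch
import torch.nn as nn

class LanguageModule(nn.Module):
    """Contextual text encoder; also pools entity/relation descriptions."""
    def __init__(self, vocab_size=30522, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids):                 # (batch, seq) -> (batch, seq, dim)
        return self.encoder(self.embed(token_ids))

    def describe(self, desc_ids):                 # pool a description into one vector
        return self.forward(desc_ids).mean(dim=1) # (n, dim)

class KnowledgeModule(nn.Module):
    """One relation-aware message-passing step over KG triples."""
    def __init__(self, dim=128):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, ent, rel, triples):         # triples: (head, rel_id, tail)
        out = ent.clone()
        for h, r, t in triples:                   # aggregate (neighbor, relation) messages
            out[t] = out[t] + torch.relu(self.proj(torch.cat([ent[h], rel[r]], dim=-1)))
        return out

lm, km = LanguageModule(), KnowledgeModule()

# Step 1: the language module initializes KG embeddings from (toy) text descriptions.
ent_desc = torch.randint(0, 30522, (3, 8))       # 3 entities, 8-token descriptions each
rel_desc = torch.randint(0, 30522, (2, 8))       # 2 relations
ent = km(lm.describe(ent_desc), lm.describe(rel_desc), triples=[(0, 0, 1), (1, 1, 2)])

# Step 2: KG entity embeddings are fused into token states at mention positions.
hidden = lm(torch.randint(0, 30522, (1, 16)))    # encode a 16-token passage
mention_pos, mention_ent = 5, 2                  # toy entity link: token 5 -> entity 2
hidden[0, mention_pos] = hidden[0, mention_pos] + ent[mention_ent]
print(hidden.shape)                              # torch.Size([1, 16, 128])

Because every entity and relation is initialized from its textual description by the language module, a KG from a new domain only needs descriptions rather than retrained embeddings, which is what the abstract means by adapting to unseen knowledge graphs.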
Pages: 11630 - 11638 (9 pages)
Related Papers
50 records in total
  • [1] Contrastive Language-knowledge Graph Pre-training
    Yuan, Xiaowei
    Liu, Kang
    Wang, Yequan
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (04)
  • [2] Graph Structure Enhanced Pre-Training Language Model for Knowledge Graph Completion
    Zhu, Huashi
    Xu, Dexuan
    Huang, Yu
    Jin, Zhi
    Ding, Weiping
    Tong, Jiahui
    Chong, Guoshuang
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04): 2697 - 2708
  • [3] SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding
    Chung, Yu-An
    Zhu, Chenguang
    Zeng, Michael
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 1897 - 1907
  • [4] Pre-training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning
    Chen, Qian
    Wang, Wen
    Zhang, Qinglin
    [J]. INTERSPEECH 2021, 2021, : 1244 - 1248
  • [5] Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training
    Agarwal, Oshin
    Ge, Heming
    Shakeri, Siamak
    Al-Rfou, Rami
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 3554 - 3565
  • [6] MPNet: Masked and Permuted Pre-training for Language Understanding
    Song, Kaitao
    Tan, Xu
    Qin, Tao
    Lu, Jianfeng
    Liu, Tie-Yan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [7] Unified Language Model Pre-training for Natural Language Understanding and Generation
    Dong, Li
    Yang, Nan
    Wang, Wenhui
    Wei, Furu
    Liu, Xiaodong
    Wang, Yu
    Gao, Jianfeng
    Zhou, Ming
    Hon, Hsiao-Wuen
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [8] Self-training Improves Pre-training for Natural Language Understanding
    Du, Jingfei
    Grave, Edouard
    Gunel, Beliz
    Chaudhary, Vishrav
    Celebi, Onur
    Auli, Michael
    Stoyanov, Veselin
    Conneau, Alexis
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5408 - 5418
  • [9] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
    Devlin, Jacob
    Chang, Ming-Wei
    Lee, Kenton
    Toutanova, Kristina
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4171 - 4186
  • [10] PRE-TRAINING FOR QUERY REWRITING IN A SPOKEN LANGUAGE UNDERSTANDING SYSTEM
    Chen, Zheng
    Fan, Xing
    Ling, Yuan
    Mathias, Lambert
    Guo, Chenlei
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7969 - 7973