HORNET: Enriching Pre-trained Language Representations with Heterogeneous Knowledge Sources

Cited by: 3
Authors
Zhang, Taolin [1 ]
Cai, Zerui [1 ]
Wang, Chengyu [2 ]
Li, Peng [2 ]
Li, Yang [2 ]
Qiu, Minghui [2 ]
Tang, Chengguang [2 ]
He, Xiaofeng [1 ]
Huang, Jun [2 ]
Affiliations
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Keywords
Natural Language Processing; Pre-trained Language Model; Knowledge Graph; Heterogeneous Graph Attention Network;
DOI
10.1145/3459637.3482436
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Subject Classification Code
0812;
Abstract
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the language understanding abilities of deep language models by leveraging the rich semantic knowledge in knowledge graphs, in addition to plain pre-training texts. However, previous efforts mostly use homogeneous knowledge (especially structured relation triples in knowledge graphs) to enhance the context-aware representations of entity mentions, so their performance may be limited by the coverage of the knowledge graphs. Also, it is unclear whether these KEPLMs truly understand the injected semantic knowledge, due to the "black-box" training mechanism. In this paper, we propose a novel KEPLM named HORNET, which integrates Heterogeneous knOwledge from various structured and unstructured sources into the Roberta NETwork and hence takes full advantage of both linguistic and factual knowledge simultaneously. Specifically, we design a hybrid attention heterogeneous graph convolution network (HaHGCN) to learn heterogeneous knowledge representations from the structured relation triples in knowledge graphs and the unstructured entity description texts. Meanwhile, we propose explicit dual knowledge understanding tasks that induce a more effective infusion of the heterogeneous knowledge, encouraging our model to learn the complicated mappings from the knowledge graph embedding space to the deep context-aware embedding space and vice versa. Experiments show that our HORNET model outperforms various KEPLM baselines on knowledge-aware tasks including knowledge probing, entity typing and relation extraction. Our model also achieves substantial improvements on several GLUE benchmark datasets, compared to other KEPLMs.
Pages: 2608-2617
Number of pages: 10
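
The abstract above describes how HORNET fuses two kinds of knowledge, structured relation triples and unstructured entity description texts, into a RoBERTa-style contextual representation through a hybrid attention mechanism (HaHGCN). The following PyTorch snippet is a minimal, illustrative sketch of that fusion idea only; it is not the authors' implementation, and all module and variable names (HeterogeneousKnowledgeFusion, triple_proj, desc_proj, etc.) are hypothetical.

# Minimal sketch (not the authors' code): enrich a contextual entity-mention
# embedding by attending over heterogeneous knowledge vectors derived from
# (a) structured KG relation triples and (b) unstructured entity descriptions.
import torch
import torch.nn as nn


class HeterogeneousKnowledgeFusion(nn.Module):
    def __init__(self, hidden_dim: int, kg_dim: int):
        super().__init__()
        # Project both knowledge sources into the contextual embedding space.
        self.triple_proj = nn.Linear(kg_dim, hidden_dim)
        self.desc_proj = nn.Linear(hidden_dim, hidden_dim)
        # Attention of the mention over all projected knowledge vectors.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, mention_vec, triple_embs, desc_embs):
        # mention_vec: (batch, hidden_dim) contextual embedding of an entity mention
        # triple_embs: (batch, n_triples, kg_dim) embeddings of related KG triples
        # desc_embs:   (batch, n_sents, hidden_dim) encoded entity description sentences
        knowledge = torch.cat(
            [self.triple_proj(triple_embs), self.desc_proj(desc_embs)], dim=1
        )
        query = mention_vec.unsqueeze(1)                      # (batch, 1, hidden_dim)
        attended, _ = self.attn(query, knowledge, knowledge)  # attend over all knowledge
        fused = self.fuse(torch.cat([mention_vec, attended.squeeze(1)], dim=-1))
        return torch.tanh(fused)                              # enriched mention representation


if __name__ == "__main__":
    fusion = HeterogeneousKnowledgeFusion(hidden_dim=768, kg_dim=100)
    mention = torch.randn(2, 768)
    triples = torch.randn(2, 5, 100)
    descs = torch.randn(2, 3, 768)
    print(fusion(mention, triples, descs).shape)  # torch.Size([2, 768])

In the full model described by the abstract, the triple embeddings would come from a knowledge graph embedding space and the description embeddings from encoded entity description texts, with the dual knowledge understanding tasks supervising the mapping between the two spaces; the sketch above only illustrates the attention-based fusion step.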