Learning Knowledge-Enhanced Contextual Language Representations for Domain Natural Language Understanding

Cited by: 0
Authors:
Zhang, Taolin [1 ,2 ]
Xu, Ruyao [1 ]
Wang, Chengyu [2 ]
Duan, Zhongjie [1 ]
Chen, Cen [1 ]
Qiu, Minghui [2 ]
Cheng, Dawei [3 ]
He, Xiaofeng [1 ]
Qian, Weining [1 ]
Affiliations:
[1] East China Normal Univ, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Tongji Univ, Shanghai, Peoples R China
Funding: National Natural Science Foundation of China
Keywords: MODEL
DOI: not available
Chinese Library Classification (CLC): TP18 [Theory of Artificial Intelligence]
Discipline Codes: 081104; 0812; 0835; 1405
Abstract:
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the performance of various downstream NLP tasks by injecting knowledge facts from large-scale Knowledge Graphs (KGs). However, existing methods for pre-training KEPLMs with relational triples are difficult to adapt to closed domains due to the lack of sufficient domain graph semantics. In this paper, we propose a Knowledge-enhanced lANGuAge Representation learning framework for various clOsed dOmains (KANGAROO) that captures the implicit graph structure among entities. Specifically, since the entity coverage rates of closed-domain KGs can be relatively low and may exhibit a global sparsity phenomenon for knowledge injection, we consider not only the shallow relational representations of triples but also the hyperbolic embeddings of deep hierarchical entity-class structures for effective knowledge fusion. Moreover, since two closed-domain entities under the same entity-class often have locally dense neighbor subgraphs, measured by maximal point biconnected components, we further propose a data augmentation strategy based on contrastive learning over subgraphs to construct hard negative samples of higher quality. This enables the underlying KEPLMs to better distinguish the semantics of these neighboring entities, further compensating for the global semantic sparsity. In the experiments, we evaluate KANGAROO on various knowledge-aware and general NLP tasks in both full and few-shot learning settings, significantly outperforming various KEPLM training paradigms in closed domains.
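The abstract leans on two technical ingredients: hyperbolic (Poincare-ball) embeddings for the deep entity-class hierarchy, and contrastive learning with hard negatives drawn from locally dense neighbor subgraphs. As a rough illustration only (a minimal sketch, not the authors' implementation; the function names, the NumPy setting, and the negative-sampling interface are assumptions), the two underlying quantities can be written as:

import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    # Distance in the Poincare ball (requires ||u||, ||v|| < 1).
    # Tree-like entity-class hierarchies embed with low distortion
    # here because distance grows rapidly toward the boundary.
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / denom))

def contrastive_loss(anchor, positive, hard_negatives, temperature=0.07):
    # InfoNCE-style loss for one anchor entity embedding.
    # `hard_negatives` is a hypothetical stand-in for embeddings of
    # entities sampled from the same dense neighbor subgraph.
    def cosine(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cosine(anchor, positive)] +
                      [cosine(anchor, n) for n in hard_negatives])
    logits = logits / temperature
    logits -= logits.max()              # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))     # positive sits at index 0

# Example: a child class typically sits farther from the origin than its parent.
parent = np.array([0.10, 0.00])
child = np.array([0.70, 0.05])
print(poincare_distance(parent, child))

Under this reading, harder negatives (near-duplicate entities from the same subgraph) shrink probs[0] for a fixed positive similarity, so the loss pushes the encoder to separate neighboring entities, which is the effect the abstract attributes to its subgraph-based augmentation.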
Pages: 15663-15676 (14 pages)
Related Papers (50 in total):
  • [1] DKPLM: Decomposable Knowledge-Enhanced Pre-trained Language Model for Natural Language Understanding
    Zhang, Taolin
    Wang, Chengyu
    Hu, Nan
    Qiu, Minghui
    Tang, Chengguang
    He, Xiaofeng
    Huang, Jun
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11703 - 11711
  • [2] SPOT: Knowledge-Enhanced Language Representations for Information Extraction
    Li, Jiacheng
    Katsis, Yannis
    Baldwin, Tyler
    Kim, Ho-Cheol
    Bartko, Andrew
    McAuley, Julian
    Hsu, Chun-Nan
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 1124 - 1134
  • [3] KEBLM: Knowledge-Enhanced Biomedical Language Models
    Lai, Tuan Manh
    Zhai, ChengXiang
    Ji, Heng
    JOURNAL OF BIOMEDICAL INFORMATICS, 2023, 143
  • [4] Leveraging BERT for Natural Language Understanding of Domain-Specific Knowledge
    Iga, Vasile Ionut
    Silaghi, Gheorghe Cosmin
    2023 25TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, SYNASC 2023, 2023, : 210 - 215
  • [5] Metadata Shaping: A Simple Approach for Knowledge-Enhanced Language Models
    Arora, Simran
    Wu, Sen
    Liu, Enci
    Re, Christopher
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1733 - 1745
  • [6] Knowledge-Enhanced Visual-Language Pretraining for Computational Pathology
    Zhou, Xiao
    Zhang, Xiaoman
    Wu, Chaoyi
    Zhang, Ya
    Xie, Weidi
    Wang, Yanfeng
    COMPUTER VISION - ECCV 2024, PT LII, 2025, 15110 : 345 - 362
  • [7] Combining large language models with enterprise knowledge graphs: a perspective on enhanced natural language understanding
    Mariotti, Luca
    Guidetti, Veronica
    Mandreoli, Federica
    Belli, Andrea
    Lombardi, Paolo
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2024, 7
  • [8] KNOWLEDGE REPRESENTATION FOR NATURAL LANGUAGE UNDERSTANDING
    Stanojevic, Mladen
    Vranes, Sanja
    FACTA UNIVERSITATIS-SERIES MATHEMATICS AND INFORMATICS, 2006, 21 : 93 - 104
  • [9] Learning to Map Natural Language Statements into Knowledge Base Representations for Knowledge Base Construction
    Lin, Chin-Ho
    Huang, Hen-Hsen
    Chen, Hsin-Hsi
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 3433 - 3437
  • [10] Construction of Legal Knowledge Graph Based on Knowledge-Enhanced Large Language Models
    Li, Jun
    Qian, Lu
    Liu, Peifeng
    Liu, Taoxiong
    INFORMATION, 2024, 15 (11)