Estimation-based optimizations for the semantic compression of RDF knowledge bases

Cited by: 1
Authors
Wang, Ruoyu [1]
Wong, Raymond [1]
Sun, Daniel [2]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[2] UGAiForge LLC, Canberra, ACT, Australia
Keywords
Knowledge bases; Semantic compression; Negative sampling; Statistical estimation; Optimization; Rule mining; SOCIAL QUESTION; GRAPH; RULES
DOI
10.1016/j.ipm.2024.103799
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Structured knowledge bases are critical for the interpretability of AI techniques. RDF KBs, which are the dominant representation of structured knowledge, are expanding extremely fast to increase their knowledge coverage, enhancing the capability of knowledge reasoning while bringing heavy burdens to downstream applications. Recent studies employ semantic compression to detect and remove knowledge redundancies via semantic models and use the induced model for further applications, such as knowledge completion and error detection. However, semantic models that are sufficiently expressive for semantic compression cannot be efficiently induced, especially for large-scale KBs, due to the hardness of logic induction. In this article, we present estimation-based optimizations for the semantic compression of RDF KBs from the perspectives of input and intermediate data involved in the induction of first-order logic rules. The negative sampling technique selects a representative subset of all negative tuples with respect to the closed-world assumption, reducing the cost of evaluating the quality of a logic rule used for knowledge inference. The number of logic inference operations used during a compression procedure is reduced by a statistical estimation technique that prunes logic rules of low quality. The evaluation results show that the two techniques are feasible for the purpose of semantic compression and accelerate the compression algorithm by up to 47x compared to the state-of-the-art system.
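The idea of scoring a rule against a sample of negatives, rather than all of them, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `estimate_rule_quality`, the precision-style quality measure, the uniform rejection sampling, and the pruning threshold are all assumptions made for illustration. Under the closed-world assumption, every entity pair that is not a known fact of the head relation is treated as a negative.

```python
import random


def estimate_rule_quality(entailed, positives, entities, sample_size, rng):
    """Estimate the share of a rule's entailed pairs that are known facts.

    Hypothetical helper for illustration only. Under the closed-world
    assumption, every entity pair absent from `positives` is a negative.
    Assumes every pair in `positives` is built from `entities`.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    ents = sorted(entities)
    n_neg = len(ents) ** 2 - len(positives)
    pos_hits = len(entailed & positives)
    if n_neg == 0:
        return 1.0 if pos_hits else 0.0

    # Rejection-sample distinct negative pairs instead of enumerating them all.
    target = min(sample_size, n_neg)
    sample = set()
    while len(sample) < target:
        pair = (rng.choice(ents), rng.choice(ents))
        if pair not in positives:
            sample.add(pair)

    # Scale the sampled count of wrongly entailed pairs up to all negatives.
    neg_hits = sum(1 for pair in sample if pair in entailed)
    est_neg_hits = neg_hits * n_neg / len(sample)
    total = pos_hits + est_neg_hits
    return pos_hits / total if total else 0.0


entities = {"a", "b", "c"}
positives = {("a", "b"), ("b", "c")}   # facts of the head relation
entailed = {("a", "b"), ("c", "a")}    # pairs the candidate rule infers
rng = random.Random(0)

exact = estimate_rule_quality(entailed, positives, entities, 7, rng)
rough = estimate_rule_quality(entailed, positives, entities, 3, rng)
keep_rule = rough >= 0.25  # assumed pruning threshold, not from the paper
```

When `sample_size` covers all seven negatives of this toy example, the estimate equals the exact value of 0.5 (one correct and one incorrect entailment). A smaller sample trades accuracy for fewer membership checks, which is the kind of cost reduction the abstract attributes to negative sampling.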
Pages: 20
Related papers
50 in total
  • [21] Toward Veracity Assessment in RDF Knowledge Bases: An Exploratory Analysis
    Esteves, Diego
    Rula, Anisa
    Reddy, Aniketh Janardhan
    Lehmann, Jens
    ACM JOURNAL OF DATA AND INFORMATION QUALITY, 2018, 9 (03):
  • [22] Semantic Search on Text and Knowledge Bases
    Bast, Hannah
    Buchhold, Bjoern
    Haussmann, Elmar
    FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, 2016, 10 (2-3) : 120 - +
  • [23] Semantic computing and language knowledge bases
    Wang, Lei
    Wang, Houfeng
    Yu, Shiwen
    2017 3RD INTERNATIONAL CONFERENCE ON APPLIED MATERIALS AND MANUFACTURING TECHNOLOGY (ICAMMT 2017), 2017, 242
  • [24] Fuzzy Clustering for Semantic Knowledge Bases
    Esposito, Floriana
    d'Amato, Claudia
    Fanizzi, Nicola
    FUNDAMENTA INFORMATICAE, 2010, 99 (02) : 187 - 205
  • [25] Semantic SPARQL Similarity Search Over RDF Knowledge Graphs
    Zheng, Weiguo
    Zou, Lei
    Peng, Wei
    Yan, Xifeng
    Song, Shaoxu
    Zhao, Dongyan
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (11) : 840 - 851
  • [26] RDF approximate queries based on semantic similarity
    Yan, Li
    Ma, Ruizhe
    Li, Dazhen
    Cheng, Jingwei
    COMPUTING, 2017, 99 (05) : 481 - 491
  • [27] RDF approximate queries based on semantic similarity
    Li Yan
    Ruizhe Ma
    Dazhen Li
    Jingwei Cheng
    Computing, 2017, 99 : 481 - 491
  • [28] WDAqua-corel: A Question Answering service for RDF Knowledge Bases
    Diefenbach, Dennis
    Singh, Kamal
    Maret, Pierre
    COMPANION PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE 2018 (WWW 2018), 2018 : 1087 - 1091
  • [29] KawaWiki: A semantic Wiki based on RDF templates
    Kawamoto, Kensaku
    Kitamura, Yasuhiko
    Tijerino, Yuri
    2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY, WORKSHOPS PROCEEDINGS, 2006 : 425 - +
  • [30] A hierarchical clustering method for semantic knowledge bases
    Fanizzi, Nicola
    d'Amato, Claudia
    KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS: KES 2007 - WIRN 2007, PT III, PROCEEDINGS, 2007, 4694 : 653 - +