Modeling Common Real-Word Relations Using Triples Extracted from n-Grams

Cited by: 0
Authors
Sipos, Ruben [1 ]
Mladenic, Dunja [1 ]
Grobelnik, Marko [1 ]
Brank, Janez [1 ]
Affiliation
[1] Jozef Stefan Inst, Ljubljana 1000, Slovenia
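Source
SEMANTIC WEB, PROCEEDINGS | 2009 / Vol. 5926
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract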
In this paper, we present an approach that provides generalized relations for automatic ontology building based on frequent word n-grams. Using publicly available Google n-grams as our data source, we extract relations in the form of triples and compute generalized, more abstract models. We propose an algorithm for building abstractions of the extracted triples using WordNet as background knowledge. We also present a novel approach to triple extraction using heuristics, which achieves notably better results than deep parsing applied to n-grams. This allows us to represent information gathered from the web as a set of triples modeling the common and frequent relations expressed in natural language. Our results have potential for use in different settings, including providing a knowledge base for reasoning or simply serving as statistical data useful for improving the understanding of natural languages.
Pages: 16-30
Page count: 15