Exploring DBpedia and Wikipedia for Portuguese Semantic Relationship Extraction

Authors
Batista, David S. [1, 2]
Forte, David [1, 2]
Silva, Rui [1, 2]
Martins, Bruno [1, 2]
Silva, Mario J. [1, 2]
Affiliations
[1] Inst Super Tecn, Lisbon, Portugal
[2] INESC ID, Lisbon, Portugal
Source
LINGUAMATICA | 2013 / Vol. 5 / No. 1
Keywords
Relation Extraction; Information Extraction;
DOI
Not available
Chinese Library Classification
H0 [Linguistics];
Subject classification codes
030303 ; 0501 ; 050102 ;
Abstract
The identification of semantic relationships, as expressed between named entities in text, is an important step for extracting knowledge from large document collections, such as the Web. Previous work has addressed this task for the English language through supervised learning techniques for automatic classification. The current state of the art involves the use of learning methods based on string kernels (Kim et al., 2010; Zhao and Grishman, 2005). However, such approaches require manually annotated training data for each type of semantic relationship, and they scale poorly when tens or hundreds of different relationship types have to be extracted. This article discusses an approach for distantly supervised relation extraction over texts written in Portuguese, which uses an efficient technique for measuring similarity between relation instances, based on minwise hashing (Broder, 1997) and on locality-sensitive hashing (Rajaraman and Ullman, 2011). In the proposed method, the training examples are automatically collected from Wikipedia, and correspond to sentences that express semantic relationships between pairs of entities extracted from DBpedia. These examples are represented as sets of character quadgrams and other representative elements. The sets are indexed in a data structure that implements the idea of locality-sensitive hashing. To determine which semantic relationship is expressed between a given pair of entities referenced in a sentence, the most similar training examples are retrieved, using an approximation to the Jaccard coefficient obtained through min-hashing. The relation class is assigned based on the weighted votes of the most similar examples.
Tests with a dataset from Wikipedia validate the suitability of the proposed method, showing, for instance, that the method is able to extract 10 different types of semantic relations, 8 of them corresponding to asymmetric relations, with an average F1 score of 55.6%.
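The pipeline outlined in the abstract (quadgram sets, min-hash signatures, a locality-sensitive index, weighted voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the hash function, the signature length, the band layout, the value of k, the example sentences, and the relation labels are all assumptions chosen for illustration, and the "other representative elements" mentioned in the abstract are not modeled.

```python
import hashlib
from collections import defaultdict

NUM_HASHES = 64            # signature length (assumed value)
BANDS = 16                 # number of LSH bands (assumed value)
ROWS = NUM_HASHES // BANDS  # signature positions per band


def quadgrams(text):
    """Return the set of character 4-grams of a string."""
    text = text.lower()
    if len(text) < 4:
        return {text}  # keep very short strings representable
    return {text[i:i + 4] for i in range(len(text) - 3)}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(features):
    """Min-hash signature: the minimum hash value for each seeded hash."""
    return tuple(min(_hash(seed, f) for f in features)
                 for seed in range(NUM_HASHES))


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching positions approximates the Jaccard coefficient."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


class LSHIndex:
    """Banded LSH index over min-hash signatures of training sentences."""

    def __init__(self):
        self.buckets = defaultdict(set)
        self.examples = []  # list of (signature, relation label)

    def _band_keys(self, sig):
        return [(b, sig[b * ROWS:(b + 1) * ROWS]) for b in range(BANDS)]

    def add(self, text, label):
        sig = minhash(quadgrams(text))
        idx = len(self.examples)
        self.examples.append((sig, label))
        for key in self._band_keys(sig):
            self.buckets[key].add(idx)

    def classify(self, text, k=5):
        """Vote among the k most similar candidates, weighted by similarity."""
        sig = minhash(quadgrams(text))
        candidates = set()
        for key in self._band_keys(sig):
            candidates |= self.buckets.get(key, set())
        scored = sorted(
            ((estimate_jaccard(sig, self.examples[i][0]), self.examples[i][1])
             for i in candidates),
            reverse=True,
        )[:k]
        votes = defaultdict(float)
        for sim, label in scored:
            votes[label] += sim
        return max(votes, key=votes.get) if votes else None


index = LSHIndex()
index.add("Lisboa é a capital de Portugal", "capital-of")      # hypothetical label
index.add("Fernando Pessoa nasceu em Lisboa", "place-of-birth")  # hypothetical label
predicted = index.classify("Lisboa é a capital de Portugal")
```

Only sentences that share at least one band with the query are compared, which is what keeps retrieval cheap as the number of training examples grows. More bands with fewer rows each would raise recall at the cost of more candidates to score.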
Pages: 41-57
Number of pages: 17