Semantic Annotation of Documents Based on Wikipedia Concepts

Cited by: 0
Authors
Brank, Janez [1]
Leban, Gregor [1]
Grobelnik, Marko [1]
Affiliations
[1] Jozef Stefan Inst, Jamova 39, Ljubljana, Slovenia
Source
Funding
EU Horizon 2020;
Keywords
semantic annotation; wikification; disambiguation; text mining;
DOI
Not available
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Semantic annotation is the task of augmenting an unstructured textual document with semantic information, such as concepts from an ontology. In wikification, Wikipedia is used as the ontology and its pages (articles) are regarded as representations of concepts. We describe an efficient approach for annotating a document with relevant concepts from Wikipedia. A global disambiguation method, based on constructing a mention-concept graph and computing PageRank over it, is used to identify a coherent set of relevant concepts by considering the input document as a whole. The presented approach is suitable for parallel processing and can support any language for which a sufficiently large Wikipedia is available. Several heuristics involved in the disambiguation of candidate annotations are discussed, and an experimental evaluation of their influence is presented.
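The global disambiguation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the graph layout, the edge weights, the function names `pagerank` and `disambiguate`, and the toy data are all assumptions made for illustration. Each mention links to its candidate concepts with a prior weight, related concepts link to each other, and each mention is finally assigned its highest-ranked candidate.

```python
from collections import defaultdict


def pagerank(edges, nodes, damping=0.85, iterations=50):
    """Power-iteration PageRank on a weighted directed graph.

    edges maps source -> {target: weight}; nodes lists every node.
    """
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for src in nodes:
            out = edges.get(src, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[src] / n
                for v in nodes:
                    new[v] += share
            else:
                for dst, weight in out.items():
                    new[dst] += damping * rank[src] * weight / total
        rank = new
    return rank


def disambiguate(candidates, related):
    """Pick one concept per mention using PageRank over a mention-concept graph.

    candidates: mention -> {concept: prior weight} (hypothetical input format).
    related: (concept_a, concept_b) -> relatedness weight, treated as symmetric.
    """
    edges = defaultdict(dict)
    nodes = {}  # dict used as an ordered set
    for mention, concepts in candidates.items():
        m = ("mention", mention)
        nodes[m] = None
        for concept, prior in concepts.items():
            c = ("concept", concept)
            nodes[c] = None
            edges[m][c] = prior
    for (a, b), weight in related.items():
        ca, cb = ("concept", a), ("concept", b)
        if ca in nodes and cb in nodes:
            edges[ca][cb] = weight
            edges[cb][ca] = weight
    rank = pagerank(edges, list(nodes))
    return {
        mention: max(concepts, key=lambda c: rank[("concept", c)])
        for mention, concepts in candidates.items()
    }


# Toy data (made up): the prior alone would favor the animal sense of "Jaguar".
candidates = {
    "Jaguar": {"Jaguar (animal)": 0.6, "Jaguar Cars": 0.4},
    "engine": {"Internal combustion engine": 1.0},
    "sedan": {"Sedan (automobile)": 1.0},
}
related = {
    ("Jaguar Cars", "Internal combustion engine"): 1.0,
    ("Jaguar Cars", "Sedan (automobile)"): 1.0,
}
print(disambiguate(candidates, related))
```

In this toy input the mention "Jaguar" resolves to "Jaguar Cars" even though its prior is lower, because the car-related concepts reinforce each other through the graph. That is the sense in which the document is considered as a whole.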
Pages: 23 - 32
Page count: 10
Related papers
50 in total
  • [1] Semantic annotation of documents based on wikipedia concepts
    Brank, Janez
    Leban, Gregor
    Grobelnik, Marko
    [J]. Informatica (Slovenia), 2018, 42 (01) : 23 - 32
  • [2] Semantic Annotation of Unstructured Documents Using Concepts Similarity
    Pech, Fernando
    Martinez, Alicia
    Estrada, Hugo
    Hernandez, Yasmin
    [J]. SCIENTIFIC PROGRAMMING, 2017, 2017
  • [3] A WIKIPEDIA-BASED FRAMEWORK FOR COLLABORATIVE SEMANTIC ANNOTATION
    Fernandez, N.
    Fisteus, J. A.
    Fuentes, D.
    Sanchez, L.
    Luque, V.
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2011, 20 (05) : 847 - 886
  • [4] Information extraction and semantic annotation of wikipedia
    Computer Science Department, Universidad Autonoma de Madrid, Spain
    Unknown
    [J]. Front. Artif. Intell. Appl., 2008, 1 (145-169):
  • [5] An Ontology-Driven Approach for Semantic Annotation of Documents with Specific Concepts
    Alec, Celine
    Reynaud-Delaitre, Chantal
    Safar, Brigitte
    [J]. SEMANTIC WEB: LATEST ADVANCES AND NEW DOMAINS, 2016, 9678 : 609 - 624
  • [6] Semantic Annotation in Historical Documents
    Pereira, Juliana Wolf
    Barros Goncalves, Marcelo Rocha
    Prado Santos, Marilde Terezinha
    [J]. 2017 12TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI), 2017
  • [7] An annotation tool for semantic documents
    Eriksson, Henrik
    [J]. SEMANTIC WEB: RESEARCH AND APPLICATIONS, PROCEEDINGS, 2007, 4519 : 759 - 768
  • [8] Heuristics based semantic annotation of biodiversity documents in Chinese
    Yufeng DUAN
    Zhenzhen HEI
    Fei JU
    Hong CUI
    [J]. Journal of Data and Information Science, 2013, 6 (02) : 33 - 46
  • [9] Improved semantic annotation method for documents based on ontology
    Chen, Yewang
    Li, Wen
    Peng, Xin
    Zhao, Wenyun
    [J]. Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2009, 39 (06) : 1109 - 1113
  • [10] Ontology based semantic annotation of Urdu language web documents
    Rajput, Quratulain
    [J]. KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS 18TH ANNUAL CONFERENCE, KES-2014, 2014, 35 : 662 - 670