Named Entity Corpus Construction using Wikipedia and DBpedia Ontology

Cited by: 0
Authors
Hahm, Younggyun [1]
Park, Jungyeul [2]
Lim, Kyungtae [3]
Kim, Youngsik [3]
Hwang, Dosam [4]
Choi, Key-Sun [1, 3]
Affiliations
[1] Korea Adv Inst Sci & Technol, Div Web Sci & Technol, Taejon, South Korea
[2] Univ Rennes 1, IRISA, UMR 6074, Lannion, France
[3] Korea Adv Inst Sci & Technol, Dept Comp Sci, Taejon, South Korea
[4] Yeungnam Univ, Dept Comp Sci, Gyongsan, Gyeongsangbuk D, South Korea
Keywords
Corpus; Named Entity Recognition; Linked Data;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification code
030303; 0501; 050102;
Abstract
In this paper, we propose a novel method to automatically build a named entity (NE) corpus based on the DBpedia ontology. Most named entity recognition (NER) systems require training data produced by time- and effort-consuming annotation tasks, so work on NER has thus far been limited to certain languages, such as English, that are resource-abundant in general. As an alternative, we suggest that the NE corpus generated by our proposed method can be used as training data. Our approach uses Wikipedia as raw text and the DBpedia data set for named entity disambiguation. Our method is language-independent and easy to apply to the many languages for which Wikipedia and DBpedia are available. Throughout the paper, we demonstrate that our NE corpus is of comparable quality even to a manually annotated NE corpus.
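The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that internal links in Wikipedia markup can serve as entity mentions and that each link target can be looked up in a DBpedia-style type table. The names `DBPEDIA_TYPES`, `WIKILINK`, and `annotate` are hypothetical, and the two-entry table is a toy stand-in for real DBpedia data.

```python
import re

# Hypothetical stand-in for DBpedia instance-type data
# (Wikipedia article title -> ontology class). Assumption, not real data.
DBPEDIA_TYPES = {
    "Barack Obama": "Person",
    "Honolulu": "Place",
}

# Matches [[Target]] and [[Target|anchor text]] wiki links.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def annotate(wikitext, types=DBPEDIA_TYPES):
    """Return (token, BIO tag) pairs for one whitespace-tokenized sentence."""
    tagged = []
    pos = 0
    for match in WIKILINK.finditer(wikitext):
        # Plain text before the link lies outside any entity.
        tagged += [(tok, "O") for tok in wikitext[pos:match.start()].split()]
        target = match.group(1).strip()
        anchor = (match.group(2) or target).strip()
        # The link target, not the surface form, selects the entity type.
        label = types.get(target)
        for i, tok in enumerate(anchor.split()):
            if label is None:
                tagged.append((tok, "O"))  # link target has no known type
            else:
                tagged.append((tok, ("B-" if i == 0 else "I-") + label))
        pos = match.end()
    # Remaining plain text after the last link.
    tagged += [(tok, "O") for tok in wikitext[pos:].split()]
    return tagged
```

Calling `annotate("[[Barack Obama|Obama]] was born in [[Honolulu]] .")` tags `Obama` as `B-Person` and `Honolulu` as `B-Place`. Because the lookup uses the link target, an ambiguous surface form such as "Obama" is resolved by the page it points to, which is one plausible reading of how linked data can support disambiguation.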
Pages: 2565-2569
Page count: 5
Related papers
50 in total
  • [41] Research on Chinese Named Entity Recognition Based on Ontology
    Chang, Weili
    Luo, Fang
    Qian, Jilai
    MECHANICAL ENGINEERING AND INTELLIGENT SYSTEMS, PTS 1 AND 2, 2012, 195-196 : 1180 - 1185
  • [42] Entity Typing Using Distributional Semantics and DBpedia
    van Erp, Marieke
    Vossen, Piek
    KNOWLEDGE GRAPHS AND LANGUAGE TECHNOLOGY, 2017, 10579 : 102 - 118
  • [43] Scalable Visualization of DBpedia Ontology Using Hadoop
    Kim, Sung-min
    Park, Seong-hun
    Ha, Young-guk
    ACTIVE MEDIA TECHNOLOGY, AMT 2013, 2013, 8210 : 301 - 306
  • [44] Ontology Attention Layer for Medical Named Entity Recognition
    Zha, Yue
    Ke, Yuanzhi
    Hu, Xiao
    Xiong, Caiquan
    APPLIED SCIENCES-BASEL, 2024, 14 (01):
  • [45] Named entity recognition in the legal domain for ontology population
    Bruckschen, Mirian
    Northfleet, Caio
    da Silva, Douglas
    Bridi, Paulo
    Granada, Roger
    Vieira, Renata
    Rao, Prasad
    Sander, Tomas
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : I16 - I21
  • [46] Fine-Grained Named Entity Classification with Wikipedia Article Vectors
    Suzuki, Masatoshi
    Matsuda, Koji
    Sekine, Satoshi
    Okazaki, Naoaki
    Inui, Kentaro
    2016 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2016), 2016, : 483 - 486
  • [47] A model for ranking entity attributes using DBpedia
    Alahmari, Fahad
    Thom, James A.
    Magee, Liam
    ASLIB JOURNAL OF INFORMATION MANAGEMENT, 2014, 66 (05) : 473 - 493
  • [48] Czech Historical Named Entity Corpus v 1.0
    Hubkova, Helena
    Kral, Pavel
    Pettersson, Eva
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4458 - 4465
  • [49] An Open Corpus for Named Entity Recognition in Historic Newspapers
    Neudecker, Clemens
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 4348 - 4352
  • [50] Information Extraction based on Named Entity for Tourism Corpus
    Chantrapornchai, Chantana
    Tunsakul, Aphisit
    2019 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE 2019), 2019, : 187 - 192