A Bootstrapping Algorithm for Geo-Entity Relation Extraction from Online Encyclopedia

Citations: 0
Authors
Yu, Li [1]
Lu, Feng [1]
Affiliations
[1] Chinese Acad Sci, IGSNRR, State Key Lab Resources & Environm Informat Syst, Beijing, Peoples R China
Keywords
web texts; text mining; geo-entity; relation extraction; bootstrapping;
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Extracting spatial and semantic relations between pairs of geo-entities from web texts is a core problem in geographical information retrieval. The primary methods are pattern matching and supervised learning. Because patterns adapt poorly and therefore have limited coverage, and supervised learning needs large amounts of labeled data that are expensive to produce, both struggle to process massive and diverse web texts. Inspired by frequency statistics, an important technique in unsupervised relation extraction, this paper puts forward a novel approach to automatically extracting geo-entity relations without much manual effort. First, we recast relation extraction as a keyword extraction problem and analyze the characteristics of words (part of speech, position, and distance to entity) by means of bootstrapping. Second, we calculate the weight of each word in a sentence containing a pair of geo-entities, based on the statistics of these characteristics, and pick the word with the maximum weight as the relation between that pair of geo-entities. Last, we construct relation instances to obtain structured information. In the experiment, we used bootstrapping to evaluate precision and recall on the popular Chinese-language sources Sina Travel and BaiduBaike, and compared the results with three frequency-statistic approaches (Frequency, TF-IDF, and PPMI). The presented method is argued to have the following advantages: (1) it can automatically explore lexical features from natural language texts, requiring neither domain expert knowledge nor large-scale corpora, and it breaks the restriction of closed relation types; (2) compared with the three classical frequency-statistic methods, precision and recall are improved by 5% and 23%, respectively.
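The weighting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the weight is taken to be a product of add-one-smoothed relative frequencies of the three word characteristics (part of speech, position relative to the entity pair, and distance to the nearest entity), and all function names, the seed format, and the part-of-speech tags are hypothetical rather than the authors' implementation.

```python
from collections import Counter

def position_of(i, e1, e2):
    """Position of token i relative to the two entity indices."""
    lo, hi = min(e1, e2), max(e1, e2)
    if i < lo:
        return "before"
    if i > hi:
        return "after"
    return "between"

def distance_of(i, e1, e2):
    """Token distance from i to the nearest of the two entities."""
    return min(abs(i - e1), abs(i - e2))

def learn_stats(labeled):
    """Count feature values of known relation keywords.
    Each item: (tokens, e1, e2, kw) with tokens as (word, pos_tag) pairs."""
    stats = {"pos": Counter(), "position": Counter(), "distance": Counter()}
    for tokens, e1, e2, kw in labeled:
        stats["pos"][tokens[kw][1]] += 1
        stats["position"][position_of(kw, e1, e2)] += 1
        stats["distance"][distance_of(kw, e1, e2)] += 1
    return stats

def prob(counter, key, smoothing=1.0):
    """Add-one-smoothed relative frequency (assumed, not from the paper)."""
    total = sum(counter.values())
    return (counter[key] + smoothing) / (total + smoothing * (len(counter) + 1))

def word_weight(stats, tokens, i, e1, e2):
    """Assumed weight: product of the three smoothed feature frequencies."""
    return (prob(stats["pos"], tokens[i][1])
            * prob(stats["position"], position_of(i, e1, e2))
            * prob(stats["distance"], distance_of(i, e1, e2)))

def best_index(stats, tokens, e1, e2):
    """Index of the non-entity token with the maximum weight."""
    candidates = [i for i in range(len(tokens)) if i not in (e1, e2)]
    return max(candidates, key=lambda i: word_weight(stats, tokens, i, e1, e2))

def extract_relation(stats, tokens, e1, e2):
    """Build an (entity, relation keyword, entity) instance."""
    kw = best_index(stats, tokens, e1, e2)
    return (tokens[e1][0], tokens[kw][0], tokens[e2][0])

def bootstrap(seeds, unlabeled, rounds=2):
    """Re-estimate the statistics from the seeds plus the current extractions."""
    stats = learn_stats(seeds)
    triples = []
    for _ in range(rounds):
        labeled = list(seeds)
        triples = []
        for tokens, e1, e2 in unlabeled:
            kw = best_index(stats, tokens, e1, e2)
            labeled.append((tokens, e1, e2, kw))
            triples.append((tokens[e1][0], tokens[kw][0], tokens[e2][0]))
        stats = learn_stats(labeled)
    return triples

# Hypothetical seed sentences with hand-marked relation keywords.
seeds = [
    ([("Beijing", "ns"), ("is", "v"), ("located", "v"), ("in", "p"), ("China", "ns")], 0, 4, 2),
    ([("Yangtze", "ns"), ("flows", "v"), ("through", "p"), ("Wuhan", "ns")], 0, 3, 1),
]
sentence = [("The", "d"), ("Yellow River", "ns"), ("crosses", "v"), ("Lanzhou", "ns"), ("city", "n")]
print(extract_relation(learn_stats(seeds), sentence, 1, 3))
```

On this toy input the verb lying between the two entities receives the highest weight, because both seed keywords are verbs located between their entity pairs. A product of frequencies is only one plausible way to combine the three characteristics; a weighted sum would fit the abstract's description equally well.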
Pages: 5
Related papers
37 in total
  • [1] Context Enhanced Keyword Extraction for Sparse Geo-Entity Relation from Web Texts
    Yu, Li
    Lu, Feng
    Zhang, Xueying
    Liu, Xiliang
    [J]. Web Technologies and Applications: APWeb 2016 Workshops, WDMA, GAP, and SDMA, 2016, 9865 : 253 - 264
  • [2] Bootstrapping Joint Entity and Relation Extraction with Reinforcement Learning
    Xia, Min
    Cheng, Xiang
    Su, Sen
    Kuang, Ming
    Li, Gang
    [J]. WEB INFORMATION SYSTEMS ENGINEERING - WISE 2022, 2022, 13724 : 418 - 432
  • [3] Reducing Semantic Drift in Bootstrapping for Entity Relation Extraction
    Chen Sijia
    Li Yan
    Chen Guang
    [J]. PROCEEDINGS 2013 INTERNATIONAL CONFERENCE ON MECHATRONIC SCIENCES, ELECTRIC ENGINEERING AND COMPUTER (MEC), 2013, : 1947 - 1950
  • [4] A named entity relation extraction method based on bootstrapping
    He Tingting
    Xu Chao
    Li Jing
    Zhao Junzhe
    [J]. 2005 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND TECHNOLOGY, PROCEEDINGS, 2005, : 758 - 763
  • [5] Entity Recognition and Relations Extraction Based on the Structure of Online Encyclopedia
    Song, Qing
    Yang, Yue
    [J]. 3RD INTERNATIONAL CONFERENCE ON APPLIED COMPUTING AND INFORMATION TECHNOLOGY (ACIT 2015) 2ND INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND INTELLIGENCE (CSI 2015), 2015, : 478 - 482
  • [6] Bootstrapping Multilingual Relation Discovery Using English Wikipedia and Wikimedia-Induced Entity Extraction
    Schone, Patrick
    Allison, Tim
    Giannella, Chris
    Pfeifer, Craig
    [J]. 2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 944 - 951
  • [7] Relation Extraction and Discovery from Free Texts via Bootstrapping
    Yang, Yunlong
    Luo, Jie
    [J]. 2017 10TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2017, : 116 - 121
  • [8] Novel and efficient algorithm for entity relation extraction with the corpus knowledge graph
    Hu, Daiwang
    Jiao, Yiyuan
    Li, Yanni
    [J]. Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2021, 48 (06): : 75 - 83
  • [9] A cognitive-related entity and relation extraction model for online tutoring systems
    Zhu, Mengmeng
    Zhao, Defang
    Yang, Juan
    [J]. 2017 13TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG 2017), 2017, : 113 - 119
  • [10] Unsupervised entity and relation extraction from clinical records in Italian
    Alicante, Anita
    Corazza, Anna
    Isgro, Francesco
    Silvestri, Stefano
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2016, 72 : 263 - 275