A Bootstrapping Algorithm for Geo-Entity Relation Extraction from Online Encyclopedia

Cited by: 0
Authors
Yu, Li [1]
Lu, Feng [1]
Affiliations
[1] Chinese Acad Sci, IGSNRR, State Key Lab Resources & Environm Informat Syst, Beijing, Peoples R China
Keywords
web texts; text mining; geo-entity; relation extraction; bootstrapping
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Extracting spatial and semantic relations between two geo-entities from web texts is a core problem in geographical information retrieval. The primary methods are pattern matching and supervised learning. Because patterns adapt poorly and therefore have limited coverage, and supervised learning needs large amounts of expensive labeled data, both struggle to process massive and diverse web texts. Inspired by frequency statistics, an important technique in unsupervised relation extraction, this paper proposes a novel approach that extracts geo-entity relations automatically with little manual effort. First, we recast relation extraction as a keyword extraction problem and analyze word characteristics (part of speech, position, and distance to the entity) by means of bootstrapping. Second, we calculate the weight of each word in a sentence containing a pair of geo-entities, based on the statistics of those characteristics, and select the word with the maximum weight as the relation for that pair. Lastly, we construct relation instances to obtain structured information. In the experiment, we used bootstrapping to evaluate precision and recall on Chinese texts from the popular sources Sina Travel and Baidu Baike, and compared the results with three frequency statistics approaches (Frequency, TF-IDF, and PPMI). The presented method is argued to have the following advantages: (1) it automatically explores lexical features from natural language texts, requiring neither domain expert knowledge nor large-scale corpora, and it breaks the restriction of closed relation types; (2) compared with the three classical frequency statistics methods, precision and recall are improved by 5% and 23% respectively.
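The weighting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weighting formula, so the product of three feature scores below is an assumption, and the score tables (POS_SCORE, POSITION_SCORE), the inverse-distance score, and all function names are hypothetical placeholders for statistics that the paper obtains through bootstrapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    pos: str  # part-of-speech tag, e.g. "v" for verb, "ns" for place name


# Hypothetical feature statistics. In the paper these would be learned by
# bootstrapping; the numbers here are made up for illustration only.
POS_SCORE = {"v": 0.6, "n": 0.3, "p": 0.1}
POSITION_SCORE = {"before": 0.1, "between": 0.7, "after": 0.2}


def position(i, e1, e2):
    """Where token i sits relative to the two entity indices."""
    lo, hi = sorted((e1, e2))
    if i < lo:
        return "before"
    if i > hi:
        return "after"
    return "between"


def distance_score(i, e1, e2):
    """Assumed score: inverse of the token distance to the nearer entity."""
    return 1.0 / min(abs(i - e1), abs(i - e2))


def word_weight(tokens, i, e1, e2):
    """Assumed combination: product of the three feature scores."""
    return (
        POS_SCORE.get(tokens[i].pos, 0.0)
        * POSITION_SCORE[position(i, e1, e2)]
        * distance_score(i, e1, e2)
    )


def extract_relation(tokens, e1, e2):
    """Pick the highest-weighted non-entity word as the relation keyword."""
    candidates = [i for i in range(len(tokens)) if i not in (e1, e2)]
    if not candidates:
        return None
    best = max(candidates, key=lambda i: word_weight(tokens, i, e1, e2))
    return (tokens[e1].text, tokens[best].text, tokens[e2].text)


# A pre-segmented, pre-tagged sentence with geo-entities at indices 0 and 2.
tokens = [
    Token("Yangtze River", "ns"),
    Token("flows through", "v"),
    Token("Wuhan", "ns"),
    Token("city", "n"),
]
print(extract_relation(tokens, 0, 2))
# ('Yangtze River', 'flows through', 'Wuhan')
```

The returned triple corresponds to the "relation instance" the abstract mentions; a real system would also need word segmentation, part-of-speech tagging, and geo-entity recognition upstream of this step.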
Pages: 5