A machine learning approach to information extraction

Cited by: 0
Authors
Téllez-Valero, A
Montes-y-Gómez, M
Villaseñor-Pineda, L
Affiliations
[1] INAOE, Dept Comp Sci, Language Technol Grp, Mexico City, DF, Mexico
[2] Univ Polytecn Valencia, Dept Informat Syst & Computat, Valencia, Spain
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Information extraction is concerned with applying natural language processing to automatically extract the essential details from text documents. A major disadvantage of current approaches is their intrinsic dependence on the application domain and the target language. Several machine learning techniques have been applied to make information extraction systems more portable. This paper describes a general method for building an information extraction system using regular expressions along with supervised learning algorithms. In this method, the extraction decisions are led by a set of classifiers instead of sophisticated linguistic analyses. The paper also presents a system called TOPO that extracts information related to natural disasters from Spanish-language newspaper articles. Experimental results for this system indicate that the proposed method can be a practical solution for building information extraction systems, reaching an F-measure as high as 72%.
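
The general idea in the abstract (regular expressions propose candidate values, and a learned classifier decides which candidates to keep) can be illustrated with a minimal Python sketch. This is not the TOPO system or the authors' implementation: the candidate pattern `NUMBER`, the toy English training sentences in `TRAIN`, the word-count voting classifier, and all function names are assumptions made only for illustration. The `f_measure` helper uses the standard balanced definition, F = 2PR / (P + R); the abstract does not state which variant the paper reports.

```python
import re
from collections import Counter, defaultdict

# Hypothetical candidate pattern: every standalone number is a candidate.
NUMBER = re.compile(r"\b\d+\b")

# Toy labelled data (assumed, not from the paper): text plus the label
# of each number in it. Numbers without a label are treated as "none".
TRAIN = [
    ("The flood left 12 people dead in the region", {"12": "dead"}),
    ("The earthquake destroyed 300 houses near the coast", {"300": "houses"}),
    ("The storm hit on day 5 of the month", {}),
]


def context(text, match, width=2):
    """Lowercased words within `width` tokens on each side of the match."""
    left = text[:match.start()].lower().split()[-width:]
    right = text[match.end():].lower().split()[:width]
    return [w.strip(".,;") for w in left + right]


def train(examples):
    """Count how often each context word co-occurs with each label."""
    counts = defaultdict(Counter)
    for text, labels in examples:
        for m in NUMBER.finditer(text):
            label = labels.get(m.group(), "none")
            for w in context(text, m):
                counts[w][label] += 1
    return counts


def classify(counts, words):
    """Pick the label with the most votes; 'none' means discard."""
    votes = Counter()
    for w in words:
        votes.update(counts.get(w, {}))
    return votes.most_common(1)[0][0] if votes else "none"


def extract(counts, text):
    """Regex proposes candidates; the classifier accepts or rejects them."""
    result = {}
    for m in NUMBER.finditer(text):
        label = classify(counts, context(text, m))
        if label != "none":
            result.setdefault(label, []).append(m.group())
    return result


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    model = train(TRAIN)
    print(extract(model, "A landslide left 7 people dead and destroyed "
                         "40 houses on day 3 of the week"))
```

The point of this design is that no parsing or other linguistic analysis is involved: only surface patterns and the words around each candidate are used. Any supervised learner could stand in for the voting classifier used here.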
Pages: 539-547
Number of pages: 9