Incremental information extraction using tree-based context representations

Cited by: 0
Authors
Siefkes, C [1]
Affiliations
[1] Free Univ Berlin, Berlin Brandenburg Grad Sch Distributed Informat, Database & Informat Syst Grp, D-14195 Berlin, Germany
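Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes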
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The purpose of information extraction (IE) is to find desired pieces of information in natural language texts and store them in a form that is suitable for automatic processing. Providing annotated training data to adapt a trainable IE system to a new domain requires a considerable amount of work. To address this, we explore incremental learning: training documents are annotated sequentially by a user and immediately incorporated into the extraction model. The system can thus support the user by proposing extractions based on the current extraction model, reducing the user's workload over time. We introduce an approach that models IE as a token classification task and allows incremental training. To provide sufficient information to the token classifiers, we use rich, tree-based context representations of each token as feature vectors. These representations make use of the heuristically deduced document structure in addition to linguistic and semantic information. We treat the resulting feature vectors as ordered and combine proximate features into more expressive joint features, called "Orthogonal Sparse Bigrams" (OSB). Our results indicate that this setup makes it possible to employ IE in an incremental fashion without a serious performance penalty.
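The abstract does not spell out how proximate features are combined into OSB features. The following is a minimal sketch, assuming the common sliding-window construction in which each feature is paired with each of the features that precede it inside a fixed window, and the number of skipped positions is kept as part of the joint feature. The function name `osb_features`, the default window size of 5, and the `(earlier, later, gap)` triple format are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def osb_features(features: List[str], window: int = 5) -> List[Tuple[str, str, int]]:
    """Build sparse bigram-style joint features over an ordered feature list.

    Each feature is paired with every feature that precedes it within
    `window` positions (assumed default: 5). The result is a list of
    (earlier, later, gap) triples, where `gap` is the number of positions
    skipped between the two features.
    """
    pairs: List[Tuple[str, str, int]] = []
    for i, later in enumerate(features):
        # Only look back at most `window - 1` positions.
        start = max(0, i - (window - 1))
        for j in range(start, i):
            gap = i - j - 1  # 0 means the two features are adjacent
            pairs.append((features[j], later, gap))
    return pairs


if __name__ == "__main__":
    # Hypothetical ordered context features for one token.
    context = ["Speaker", ":", "Dr.", "Smith"]
    for joint in osb_features(context, window=3):
        print(joint)
```

Keeping the gap in each triple distinguishes adjacent pairs from pairs that skip intermediate features, which is what makes the joint features more expressive than plain bigrams while staying far sparser than all feature combinations.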
Pages: 510 - 521
Page count: 12
Related papers
50 in total
  • [21] Incremental tree-based successive POI recommendation in location-based social networks
    Amirat, Hanane
    Lagraa, Nasreddine
    Fournier-Viger, Philippe
    Ouinten, Youcef
    Kherfi, Mohammed Lamine
    Guellouma, Younes
    APPLIED INTELLIGENCE, 2023, 53 (07) : 7562 - 7598
  • [22] Incremental tree-based successive POI recommendation in location-based social networks
    Hanane Amirat
    Nasreddine Lagraa
    Philippe Fournier-Viger
    Youcef Ouinten
    Mohammed Lamine Kherfi
    Younes Guellouma
    Applied Intelligence, 2023, 53 : 7562 - 7598
  • [23] Extraction of Incremental Information Using Query Evaluator
    Saste, Rasika P.
    Patil, Sachin S.
    2014 FIRST INTERNATIONAL CONFERENCE ON NETWORKS & SOFT COMPUTING (ICNSC), 2014, : 324 - 328
  • [24] Incremental Information Extraction Using Relational Databases
    Tari, Luis
    Phan Huy Tu
    Hakenberg, Joerg
    Chen, Yi
    Tran Cao Son
    Gonzalez, Graciela
    Baral, Chitta
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (01) : 86 - 99
  • [25] Performance Analysis of Tree-Based Algorithms for Incremental High Utility Pattern Mining
    Ryang, Heungmo
    Yun, Unil
    ADVANCES IN COMPUTER SCIENCE AND UBIQUITOUS COMPUTING, 2017, 421 : 127 - 131
  • [26] Context-dependent HMM modeling using tree-based clustering for the recognition of handwritten words
    Bianne, Anne-Laure
    Kermorvant, Christopher
    Likforman-Sulem, Laurence
    DOCUMENT RECOGNITION AND RETRIEVAL XVII, 2010, 7534
  • [27] DECISION TREE-BASED CONTEXT CLUSTERING BASED ON CROSS VALIDATION AND HIERARCHICAL PRIORS
    Zen, Heiga
    Gales, M. J. F.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4560 - 4563
  • [28] Tree-based classification and regression Part 3: Tree-based procedures
    Gunter, B
    QUALITY PROGRESS, 1998, 31 (02) : 121 - 123
  • [29] An Implementation of Decision Tree-Based Context Clustering on Graphics Processing Units
    Pilkington, Nicholas
    Zen, Heiga
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 833 - 836
  • [30] The predictability of tree-based machine learning algorithms in the big data context
    Qolipour F.
    Ghasemzadeh M.
    Mohammad-Karimi N.
    International Journal of Engineering, Transactions A: Basics, 2021, 34 (01) : 82 - 89