Incremental information extraction using tree-based context representations

Authors
Siefkes, C [1]
Affiliations
[1] Free Univ Berlin, Berlin Brandenburg Grad Sch Distributed Informat, Database & Informat Syst Grp, D-14195 Berlin, Germany
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The purpose of information extraction (IE) is to find desired pieces of information in natural language texts and store them in a form that is suitable for automatic processing. Providing annotated training data to adapt a trainable IE system to a new domain requires a considerable amount of work. To address this, we explore incremental learning. Here training documents are annotated sequentially by a user and immediately incorporated into the extraction model. Thus the system can support the user by proposing extractions based on the current extraction model, reducing the workload of the user over time. We introduce an approach to modeling IE as a token classification task that allows incremental training. To provide sufficient information to the token classifiers, we use rich, tree-based context representations of each token as feature vectors. These representations make use of the heuristically deduced document structure in addition to linguistic and semantic information. We consider the resulting feature vectors as ordered and combine proximate features into more expressive joint features, called "Orthogonal Sparse Bigrams" (OSB). Our results indicate that this setup makes it possible to employ IE in an incremental fashion without a serious performance penalty.
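The abstract does not spell out how the joint features are built, so the following is only a minimal sketch of one common reading of Orthogonal Sparse Bigrams: within a sliding window over an ordered feature list, each feature is paired with each of the few features preceding it, and the number of skipped positions is recorded. The function name `osb_features`, the window size, and the feature strings are hypothetical illustrations, not taken from the paper.

```python
from typing import List, Tuple


def osb_features(features: List[str], window: int = 5) -> List[Tuple[str, int, str]]:
    """Pair each feature with each of the up to (window - 1) features before it.

    Each joint feature is (earlier_feature, skipped_positions, later_feature).
    This is an assumed construction, not the paper's confirmed definition.
    """
    joint = []
    for i, right in enumerate(features):
        for gap in range(1, window):
            j = i - gap
            if j < 0:
                break  # no more features to the left of position i
            joint.append((features[j], gap - 1, right))
    return joint


# Hypothetical ordered context features for a single token.
context = ["token=Dr.", "pos=NNP", "parent=heading", "next=Smith"]
pairs = osb_features(context, window=3)
for pair in pairs:
    print(pair)
```

With a window of 3, each feature is combined with at most its two predecessors, so the joint feature set grows linearly with the length of the feature list rather than quadratically.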
Pages: 510-521
Number of pages: 12