Active Learning with Adaptive Density Weighted Sampling for Information Extraction from Scientific Papers

Cited by: 3
Authors
Suvorov, Roman [1]
Shelmanov, Artem [1]
Smirnov, Ivan [1]
Affiliations
[1] Russian Acad Sci, Fed Res Ctr Comp Sci & Control, Moscow, Russia
Funding
Russian Foundation for Basic Research
Keywords
Information extraction; Deep linguistic analysis; Active machine learning; Scientific texts analysis;
DOI
10.1007/978-3-319-71746-3_7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The paper addresses the task of information extraction from scientific literature with machine learning methods. In particular, it considers the extraction of definitions and results from scientific publications in Russian. We note that annotating scientific texts to create a training dataset is a very labor-intensive and expensive process. To tackle this problem, we propose methods and tools based on active learning. We describe and evaluate a novel adaptive density-weighted sampling (ADWeS) meta-strategy for active learning. The experiments demonstrate that active learning can be a very efficient technique for scientific text mining, and that the proposed meta-strategy can be beneficial for annotating corpora with a strongly skewed class distribution. We also investigate informative task-independent features for information extraction from scientific texts and present an openly available tool for corpus annotation, which is equipped with ADWeS and is compatible with well-known sampling strategies.
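The abstract does not specify how ADWeS computes its scores, so the following is not the authors' method. It is only a minimal sketch of the classic density-weighted idea that such strategies build on: an unlabeled example's uncertainty under the current classifier is multiplied by its average similarity to the rest of the pool, so that uncertain outliers are ranked below uncertain but representative examples. The function names, the cosine similarity, the least-confidence uncertainty, and the exponent `beta` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def density_weighted_scores(features, positive_probs, beta=1.0):
    """Score each pool item as uncertainty * (mean similarity to the others) ** beta.

    features: list of feature vectors for the unlabeled pool (hypothetical input).
    positive_probs: classifier's P(positive) for each pool item (binary task assumed).
    beta: how strongly density is weighted; beta=0 reduces to plain uncertainty sampling.
    """
    n = len(features)
    scores = []
    for i in range(n):
        p = positive_probs[i]
        # Least-confidence uncertainty for a binary classifier, in [0, 0.5].
        uncertainty = 1.0 - max(p, 1.0 - p)
        sims = [cosine(features[i], features[j]) for j in range(n) if j != i]
        density = sum(sims) / len(sims) if sims else 0.0
        # Clamp negative mean similarity to zero before exponentiation.
        scores.append(uncertainty * max(density, 0.0) ** beta)
    return scores


def select_query(features, positive_probs, beta=1.0):
    """Return the index of the pool item with the highest density-weighted score."""
    scores = density_weighted_scores(features, positive_probs, beta)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    pool = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]  # third item is an outlier
    probs = [0.5, 0.6, 0.5]
    print(density_weighted_scores(pool, probs))
    print(select_query(pool, probs))
```

In this toy pool the first and third items are equally uncertain, but the third is far from the others, so its score is much lower and the first item is selected for annotation. An adaptive variant would presumably change the density weight over the course of annotation; how ADWeS does this is described in the paper, not here.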
Pages: 77-90
Page count: 14