Automatic extension of corpora from the intelligent ensembling of eHealth knowledge discovery systems outputs

Cited by: 2
Authors
Consuegra-Ayala, Juan Pablo [1]
Gutierrez, Yoan [2,3]
Piad-Morffis, Alejandro [1]
Almeida-Cruz, Yudivian [1]
Palomar, Manuel [2,3]
Affiliations
[1] Univ Habana, Sch Math & Comp Sci, Havana 10200, Cuba
[2] Univ Alicante, Univ Inst Comp Res IUII, Alicante 03690, Spain
[3] Univ Alicante, Dept Language & Comp Syst, Alicante 03690, Spain
Keywords
Ensemble methods; Annotated corpora; Information extraction; Entity recognition; Relation extraction; Natural language processing
DOI
10.1016/j.jbi.2021.103716
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Corpora are one of the most valuable resources at present for building machine learning systems. However, building new corpora is an expensive task, which makes the automatic extension of corpora a highly attractive task to develop. Hence, finding new strategies that reduce the cost and effort involved in this task, while at the same time guaranteeing quality, remains an open and important challenge for the research community. In this paper, we present a set of ensembling strategies oriented toward entity and relation extraction tasks. The main goal is to combine several automatically annotated versions of corpora to produce a single version with improved quality. An ensembler is built by exploring a configuration space in search of the combination that maximizes the fitness of the ensembled collection according to a reference collection. The eHealth-KD 2019 challenge was chosen for the case study. The submitted systems' outputs were ensembled, resulting in the construction of an automatically annotated collection of 8000 sentences. We show that using this collection as additional training input for a baseline algorithm has a positive impact on its performance. Additionally, the ensembling pipeline was used as a participant system in the 2020 edition of the challenge. The ensembled run achieved a slightly better performance than the individual runs.
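The abstract does not spell out the ensembling strategies or the configuration space, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single hypothetical configuration parameter (the minimum number of systems that must agree on an annotation), exact-match F1 as the fitness function, and an illustrative `(sentence_id, start, end, label)` tuple as the annotation format. All function names and sample data are invented for illustration.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical annotation format: (sentence_id, start, end, label).
Annotation = Tuple[int, int, int, str]


def f1_score(predicted: Set[Annotation], reference: Set[Annotation]) -> float:
    """Exact-match F1 between a predicted and a reference annotation set."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def vote(system_outputs: List[Set[Annotation]], min_votes: int) -> Set[Annotation]:
    """Keep every annotation proposed by at least `min_votes` systems."""
    counts = Counter(ann for output in system_outputs for ann in output)
    return {ann for ann, n in counts.items() if n >= min_votes}


def best_ensemble(
    system_outputs: List[Set[Annotation]], reference: Set[Annotation]
) -> Tuple[int, float]:
    """Try every vote threshold (the assumed configuration space) and
    return the (threshold, fitness) pair with the highest F1."""
    thresholds = range(1, len(system_outputs) + 1)
    scored = [(k, f1_score(vote(system_outputs, k), reference)) for k in thresholds]
    return max(scored, key=lambda pair: pair[1])


# Illustrative data only: three system outputs and one reference collection.
systems = [
    {(0, 0, 5, "Concept"), (0, 10, 15, "Action")},
    {(0, 0, 5, "Concept"), (0, 20, 25, "Concept")},
    {(0, 0, 5, "Concept"), (0, 10, 15, "Action"), (0, 30, 35, "Predicate")},
]
reference = {(0, 0, 5, "Concept"), (0, 10, 15, "Action")}

best_threshold, best_fitness = best_ensemble(systems, reference)
```

Once a threshold has been selected against the reference collection, the same `vote` call could be applied to unlabeled sentences to produce additional automatically annotated training data, which is the use the abstract describes.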
Pages: 16