Mining Skeletal Phenotype Descriptions from Scientific Literature

Cited by: 7
Authors
Groza, Tudor [1]
Hunter, Jane [1]
Zankl, Andreas [2, 3]
Affiliations
[1] Univ Queensland, Sch ITEE, Brisbane, Qld 4072, Australia
[2] Univ Queensland, UQCCR, Bone Dysplasia Res Grp, Brisbane, Qld 4072, Australia
[3] Royal Brisbane & Womens Hosp, Genet Hlth Queensland, Herston, Qld, Australia
Source
PLOS ONE | 2013 / Vol. 8 / Issue 2
Funding
Australian Research Council
Keywords
PERFORMANCE; ONTOLOGY;
DOI
10.1371/journal.pone.0055656
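Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes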
07 ; 0710 ; 09 ;
Abstract
Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries, similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid Machine Learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best scoring technique achieves an F-1 score of 71.52%, which is close to the state-of-the-art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method.
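The abstract does not say which aggregation techniques were tried or how the F-1 score was computed. As a rough illustration only, the sketch below assumes simple majority voting over per-token labels from four classifiers and a token-level F1 for a single positive label. The label names ("PHENO", "O"), the tie-breaking rule, and both function names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Aggregate per-token labels from several classifiers by majority vote.

    predictions: list of label sequences, one per classifier, equal length.
    Ties go to the label proposed by the earliest classifier in the list
    (an assumption made for this sketch).
    """
    aggregated = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        aggregated.append(next(l for l in labels if counts[l] == best))
    return aggregated


def f1_score(gold, predicted, positive="PHENO"):
    """Token-level F1 for one positive label (harmonic mean of P and R)."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up toy data: five tokens, four hypothetical classifiers.
gold = ["PHENO", "PHENO", "O", "O", "PHENO"]
votes = [
    ["PHENO", "PHENO", "O", "O", "O"],
    ["PHENO", "O", "O", "PHENO", "PHENO"],
    ["PHENO", "PHENO", "O", "O", "PHENO"],
    ["O", "PHENO", "PHENO", "O", "O"],
]
combined = majority_vote(votes)
print(combined, round(f1_score(gold, combined), 4))
```

Other aggregation schemes, such as weighted voting or stacking, would replace only `majority_vote`; the evaluation step stays the same.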
Pages: 8
Related papers
  • [1] Expertise Mining from Scientific Literature
    Buitelaar, Paul
    Eigner, Thomas
    K-CAP'09: PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON KNOWLEDGE CAPTURE, 2009, : 171 - 172
  • [2] Decomposing phenotype Descriptions for the Human skeletal phenome
    Groza, Tudor
    Hunter, Jane
    Zankl, Andreas
    BIOMEDICAL INFORMATICS INSIGHTS, 2013, 6 : 1 - 14
  • [3] Mining Research Problems from Scientific Literature
    Aalla, Chanakya
    Pudi, Vikram
    PROCEEDINGS OF 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, (DSAA 2016), 2016, : 351 - 360
  • [4] Mining Semantic Descriptions of Bioinformatics Web Resources from the Literature
    Afzal, Hammad
    Stevens, Robert
    Nenadic, Goran
    SEMANTIC WEB: RESEARCH AND APPLICATIONS, 2009, 5554 : 535 - 549
  • [5] Towards Technology Structure Mining from Scientific Literature
    QasemiZadeh, Behrang
    SEMANTIC WEB-ISWC 2010, PT II, 2010, 6497 : 305 - 312
  • [6] Mining molecular interactions from scientific literature using Cloud Computing
    Nazareno, Franco
    Lee, Kyung-Hee
    Cho, Wan-Sup
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW), 2010, : 864 - 865
  • [7] THE USE OF IQ AND DESCRIPTIONS OF PEOPLE WITH INTELLECTUAL DISABILITIES IN THE SCIENTIFIC LITERATURE
    Laird, Carmen
    Whitaker, Simon
    BRITISH JOURNAL OF DEVELOPMENTAL DISABILITIES, 2011, 57 (113): : 175 - 183
  • [8] Pitfalls in applying text mining to scientific literature
    Neefs, Jean-Marc
    BMC BIOINFORMATICS, 2010, 11
  • [10] Mining scientific literature to predict new relationships
    Huang, Wei
    Nakamori, Yoshiteru
    Wang, Shouyang
    Ma, Tieju
    INTELLIGENT DATA ANALYSIS, 2005, 9 (02) : 219 - 234