Mining Skeletal Phenotype Descriptions from Scientific Literature

Cited by: 7
Authors
Groza, Tudor [1]
Hunter, Jane [1]
Zankl, Andreas [2, 3]
Affiliations
[1] Univ Queensland, Sch ITEE, Brisbane, Qld 4072, Australia
[2] Univ Queensland, UQCCR, Bone Dysplasia Res Grp, Brisbane, Qld 4072, Australia
[3] Royal Brisbane & Womens Hosp, Genet Hlth Queensland, Herston, Qld, Australia
Source
PLOS ONE | 2013 / Vol. 8 / Issue 02
Funding
Australian Research Council;
Keywords
PERFORMANCE; ONTOLOGY;
DOI
10.1371/journal.pone.0055656
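Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes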
07 ; 0710 ; 09 ;
Abstract
Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries, similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid Machine Learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best scoring technique achieves an F-1 score of 71.52%, which is close to the state-of-the-art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method.
Pages: 8