Dynamic categorization of clinical research eligibility criteria by hierarchical clustering

Cited by: 34
Authors
Luo, Zhihui [1 ]
Yetisgen-Yildiz, Meliha [2 ]
Weng, Chunhua [1 ]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Univ Washington, Seattle, WA 98195 USA
Keywords
Clinical research eligibility criteria; Classification; Hierarchical clustering; Knowledge representation; Unified Medical Language System (UMLS); Machine learning; Feature representation
DOI
10.1016/j.jbi.2011.06.001
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity.
Design: The UMLS semantic types and a set of previously developed semantic preference rules were utilized to create an unambiguous semantic feature representation to induce eligibility criteria categories through hierarchical clustering and to train supervised classifiers.
Measurements: We induced 27 categories and measured the prevalence of the categories in 27,278 eligibility criteria from 1578 clinical trials and compared the classification performance (i.e., precision, recall, and F1-score) between the UMLS-based feature representation and the "bag of words" feature representation among five common classifiers in Weka, including J48, Bayesian Network, Naive Bayesian, Nearest Neighbor, and instance-based learning classifier.
Results: The UMLS semantic feature representation outperforms the "bag of words" feature representation in 89% of the criteria categories. Using the semantically induced categories, machine-learning classifiers required only 2000 instances to stabilize classification performance. The J48 classifier yielded the best F1-score and the Bayesian Network classifier achieved the best learning efficiency.
Conclusion: The UMLS is an effective knowledge source and can enable an efficient feature representation for semi-automated semantic category induction and automatic categorization for clinical research eligibility criteria and possibly other clinical text.
© 2011 Elsevier Inc. All rights reserved.
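The abstract describes inducing categories by hierarchically clustering criteria that are represented as UMLS semantic-type features. The following is only a minimal sketch of that general idea, not the authors' implementation: the toy criteria, the hand-written mapping from criteria to semantic types, the Jaccard distance, the average-linkage rule, and the 0.6 stopping threshold are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each criterion is mapped by hand to a set of
# UMLS-style semantic type labels. A real pipeline would obtain these from a
# concept-mapping tool plus disambiguation rules; that step is not shown.
criteria = {
    "age >= 18 years": {"Organism Attribute", "Quantitative Concept"},
    "age under 65": {"Organism Attribute", "Quantitative Concept"},
    "history of diabetes mellitus": {"Disease or Syndrome"},
    "diagnosis of hypertension": {"Disease or Syndrome", "Finding"},
    "currently taking warfarin": {"Pharmacologic Substance"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of semantic types."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2, features):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    pairs = [(x, y) for x in c1 for y in c2]
    total = sum(jaccard_distance(features[x], features[y]) for x, y in pairs)
    return total / len(pairs)


def agglomerative_clusters(features, threshold):
    """Merge the closest pair of clusters until the closest pair is farther
    apart than `threshold` (an assumed cut-off, not taken from the paper)."""
    clusters = [[name] for name in features]
    while len(clusters) > 1:
        (i, j), dist = min(
            (
                ((i, j), average_linkage(clusters[i], clusters[j], features))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


clusters = agglomerative_clusters(criteria, threshold=0.6)
for members in clusters:
    print(members)
```

With this toy input, the two age criteria and the two disease criteria each merge into one group, while the medication criterion stays on its own. In the workflow the abstract outlines, groups like these would then be reviewed and labeled by people (the "semi-automatic" step) before being used to train supervised classifiers.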
Pages: 927-935
Number of pages: 9
Related papers
50 items in total
  • [1] Clustering clinical trials with similar eligibility criteria features
    Hao, Tianyong
    Rusanov, Alexander
    Boland, Mary Regina
    Weng, Chunhua
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2014, 52 : 112 - 120
  • [2] A review of research on eligibility criteria for clinical trials
    Qianmin Su
    Gaoyi Cheng
    Jihan Huang
    [J]. Clinical and Experimental Medicine, 2023, 23 : 1867 - 1879
  • [3] A review of research on eligibility criteria for clinical trials
    Su, Qianmin
    Cheng, Gaoyi
    Huang, Jihan
    [J]. CLINICAL AND EXPERIMENTAL MEDICINE, 2023, 23 (06) : 1867 - 1879
  • [4] Semantic categorization of Chinese eligibility criteria in clinical trials using machine learning methods
    Hui Zong
    Jinxuan Yang
    Zeyu Zhang
    Zuofeng Li
    Xiaoyan Zhang
    [J]. BMC Medical Informatics and Decision Making, 21
  • [5] Semantic categorization of Chinese eligibility criteria in clinical trials using machine learning methods
    Zong, Hui
    Yang, Jinxuan
    Zhang, Zeyu
    Li, Zuofeng
    Zhang, Xiaoyan
    [J]. BMC MEDICAL INFORMATICS AND DECISION MAKING, 2021, 21 (01)
  • [6] Research of hierarchical clustering based on dynamic granular computing
    Li, Xue-yong
    Sun, Jia-Xia
    Gao, Guo-Hong
    Fu, Jun-Hui
    [J]. Journal of Computers, 2011, 6 (12) : 2526 - 2533
  • [7] An OMOP CDM-Based Relational Database of Clinical Research Eligibility Criteria
    Si, Yuqi
    Weng, Chunhua
    [J]. MEDINFO 2017: PRECISION HEALTHCARE THROUGH INFORMATICS, 2017, 245 : 950 - 954
  • [8] Image Categorization Using a Heuristic Automatic Clustering Method Based on Hierarchical Clustering
    LaPlante, Francois
    Kardouchi, Mustapha
    Belacel, Nabil
    [J]. IMAGE ANALYSIS AND RECOGNITION (ICIAR 2015), 2015, 9164 : 150 - 158
  • [9] Towards Phenotyping of Clinical Trial Eligibility Criteria
    Loebe, Matthias
    Staeubert, Sebastian
    Goldberg, Colleen
    Haffner, Ivonne
    Winter, Alfred
    [J]. HEALTH INFORMATICS MEETS EHEALTH: BIOMEDICAL MEETS EHEALTH - FROM SENSORS TO DECISIONS, 2018, 248 : 293 - 299
  • [10] Simplified Criteria for Predicting Lung Cancer Screening Eligibility: A Potential Clinical and Research Tool
    Triplette, M.
    Donovan, L. M.
    Rise, P. J.
    Zeliadt, S. B.
    Madtes, D. K.
    Mularski, R. A.
    Lindenauer, P. K.
    Krishnan, J. A.
    Carson, S. S.
    McBurnie, M.
    Crothers, K. A.
    Au, D. H.
    [J]. AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2018, 197