Modeling healthcare data using multiple-channel latent Dirichlet allocation

Cited by: 58
Authors
Lu, Hsin-Min [1]
Wei, Chih-Ping [1]
Hsiao, Fei-Yuan [2, 3, 4]
Affiliations
[1] Natl Taiwan Univ, Coll Management, Dept Informat Management, Taipei 106, Taiwan
[2] Natl Taiwan Univ, Grad Inst Clin Pharm, Coll Med, Taipei 100, Taiwan
[3] Natl Taiwan Univ, Sch Pharm, Coll Med, Taipei 100, Taiwan
[4] Natl Taiwan Univ Hosp, Dept Pharm, Taipei 100, Taiwan
Keywords
Healthcare data mining; Health informatics; Multiple-channel latent Dirichlet allocation; Diagnosis-medication associations; Medication prediction; Diagnosis prediction; CLINICAL DOCUMENTS; KNOWLEDGE-BASE; POPULATION; PATTERNS;
DOI
10.1016/j.jbi.2016.02.003
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Information and communications technologies have enabled healthcare institutions to accumulate large amounts of healthcare data that include diagnoses, medications, and additional contextual information such as patient demographics. To gain a better understanding of big healthcare data and to develop better data-driven clinical decision support systems, we propose a novel multiple-channel latent Dirichlet allocation (MCLDA) approach for modeling diagnoses, medications, and contextual information in healthcare data. The proposed MCLDA model assumes that a latent health status group structure is responsible for the observed co-occurrences among diagnoses, medications, and contextual information. Using a real-world research testbed that includes one million healthcare insurance claim records, we investigate the utility of MCLDA. Our empirical evaluation results suggest that MCLDA is capable of capturing the comorbidity structures and linking them with the distribution of medications. Moreover, MCLDA is able to identify the pairing between diagnoses and medications in a record based on the assigned latent groups. MCLDA can also be employed to predict missing medications or diagnoses given partial records. Our evaluation results also show that, in most cases, MCLDA outperforms alternative methods such as logistic regressions and the k-nearest-neighbor (KNN) model for two prediction tasks, i.e., medication and diagnosis prediction. Thus, MCLDA represents a promising approach to modeling healthcare data for clinical decision support. © 2016 Elsevier Inc. All rights reserved.
Pages: 210-223
Number of pages: 14
Related papers
50 in total
  • [21] Joint Sensing and Power Allocation in Multiple-Channel Cognitive Radio Networks
    Yu, Huogen
    Tang, Wanbin
    Li, Shaoqian
    [J]. IEICE TRANSACTIONS ON COMMUNICATIONS, 2012, E95B (02) : 672 - 675
  • [22] A PERCEPTUAL HASHING ALGORITHM USING LATENT DIRICHLET ALLOCATION
    Vretos, Nicholas
    Nikolaidis, Nikos
    Pitas, Ioannis
    [J]. ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 362 - 365
  • [23] Using Latent Dirichlet Allocation for Automatic Categorization of Software
    Tian, Kai
    Revelle, Meghan
    Poshyvanyk, Denys
    [J]. 2009 6TH IEEE INTERNATIONAL WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES, 2009, : 163 - 166
  • [24] Unsupervised Language Filtering using the Latent Dirichlet Allocation
    Zhang, Wei
    Clark, Robert A. J.
    Wang, Yongyuan
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1268 - 1272
  • [25] Land cover harmonization using Latent Dirichlet Allocation
    Li, Zhan
    White, Joanne C.
    Wulder, Michael A.
    Hermosilla, Txomin
    Davidson, Andrew M.
    Comber, Alexis J.
    [J]. INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2021, 35 (02) : 348 - 374
  • [26] Predicting Component Failures Using Latent Dirichlet Allocation
    Liu, Hailin
    Xu, Ling
    Yang, Mengning
    Yan, Meng
    Zhang, Xiaohong
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [27] Using Latent Dirichlet Allocation for Topic Modelling in Twitter
    Ostrowski, David Alfred
    [J]. 2015 IEEE 9TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2015, : 493 - 497
  • [28] Topic Modelling Twitter Data with Latent Dirichlet Allocation Method
    Negara, Edi Surya
    Triadi, Dendi
    Andryani, Ria
    [J]. 2019 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND COMPUTER SCIENCE (ICECOS 2019), 2019, : 386 - 390
  • [29] Inference of topics with Latent Dirichlet Allocation for Open Government Data
    Felipe da Silva, Nadia Felix
    da Silva, Nubia Rosa
    Cassiano, Katia Kelvis
    Cordeiro, Douglas Farias
    [J]. PERSPECTIVAS EM CIENCIA DA INFORMACAO, 2021, 26 (01) : 57 - 79
  • [30] Semantic similarity measure for topic modeling using latent Dirichlet allocation and collapsed Gibbs sampling
    Micheal Olalekan Ajinaja
    Adebayo Olusola Adetunmbi
    Chukwuemeka Christian Ugwu
    Olugbemiga Solomon Popoola
    [J]. Iran Journal of Computer Science, 2023, 6 (1) : 81 - 94