Modeling healthcare data using multiple-channel latent Dirichlet allocation

Cited by: 58
Authors
Lu, Hsin-Min [1]
Wei, Chih-Ping [1]
Hsiao, Fei-Yuan [2, 3, 4]
Affiliations
[1] Natl Taiwan Univ, Coll Management, Dept Informat Management, Taipei 106, Taiwan
[2] Natl Taiwan Univ, Grad Inst Clin Pharm, Coll Med, Taipei 100, Taiwan
[3] Natl Taiwan Univ, Sch Pharm, Coll Med, Taipei 100, Taiwan
[4] Natl Taiwan Univ Hosp, Dept Pharm, Taipei 100, Taiwan
Keywords
Healthcare data mining; Health informatics; Multiple-channel latent Dirichlet allocation; Diagnosis-medication associations; Medication prediction; Diagnosis prediction; CLINICAL DOCUMENTS; KNOWLEDGE-BASE; POPULATION; PATTERNS;
DOI
10.1016/j.jbi.2016.02.003
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Information and communications technologies have enabled healthcare institutions to accumulate large amounts of healthcare data that include diagnoses, medications, and additional contextual information such as patient demographics. To gain a better understanding of big healthcare data and to develop better data-driven clinical decision support systems, we propose a novel multiple-channel latent Dirichlet allocation (MCLDA) approach for modeling diagnoses, medications, and contextual information in healthcare data. The proposed MCLDA model assumes that a latent health status group structure is responsible for the observed co-occurrences among diagnoses, medications, and contextual information. Using a real-world research testbed that includes one million healthcare insurance claim records, we investigate the utility of MCLDA. Our empirical evaluation results suggest that MCLDA is capable of capturing the comorbidity structures and linking them with the distribution of medications. Moreover, MCLDA is able to identify the pairing between diagnoses and medications in a record based on the assigned latent groups. MCLDA can also be employed to predict missing medications or diagnoses given partial records. Our evaluation results also show that, in most cases, MCLDA outperforms alternative methods such as logistic regressions and the k-nearest-neighbor (KNN) model for two prediction tasks, i.e., medication and diagnosis prediction. Thus, MCLDA represents a promising approach to modeling healthcare data for clinical decision support. © 2016 Elsevier Inc. All rights reserved.
Pages: 210-223
Page count: 14
Related papers
50 in total
  • [1] Topic Modeling Twitter Data Using Latent Dirichlet Allocation and Latent Semantic Analysis
    Qomariyah, Siti
    Iriawan, Nur
    Fithriasari, Kartika
    [J]. 2ND INTERNATIONAL CONFERENCE ON SCIENCE, MATHEMATICS, ENVIRONMENT, AND EDUCATION, 2019, 2019, 2194
  • [2] Topic Modeling Using Latent Dirichlet allocation: A Survey
    Chauhan, Uttam
    Shah, Apurva
    [J]. ACM COMPUTING SURVEYS, 2021, 54 (07)
  • [3] Mining Web Log Data for News Topic Modeling Using Latent Dirichlet Allocation
    Surjandari, Isti
    Rosyidah, Asma
    Zulkarnain
    Laoh, Enrico
    [J]. 2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018), 2018, : 331 - 335
  • [4] Topic modeling for expert finding using latent Dirichlet allocation
    Momtazi, Saeedeh
    Naumann, Felix
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (05) : 346 - 353
  • [5] Latent Dirichlet Allocation for Classification using Gene Expression Data
    Yalamanchili, Hima Bindu
    Kho, Soon Jye
    Raymer, Michael L.
    [J]. 2017 IEEE 17TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE), 2017, : 39 - 44
  • [6] Latent Dirichlet Allocation modeling of environmental microbiomes
    Kim, Anastasiia
    Sevanto, Sanna
    Moore, Eric R.
    Lubbers, Nicholas
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2023, 19 (06)
  • [7] Road Traffic Topic Modeling on Twitter using Latent Dirichlet Allocation
    Hidayatullah, Ahmad Fathan
    Ma'arif, Muhammad Rifqi
    [J]. 2017 INTERNATIONAL CONFERENCE ON SUSTAINABLE INFORMATION ENGINEERING AND TECHNOLOGY (SIET), 2017, : 47 - 52
  • [8] ldagibbs: A command for topic modeling in Stata using latent Dirichlet allocation
    Schwarz, Carlo
    [J]. STATA JOURNAL, 2018, 18 (01) : 101 - 117
  • [9] A FRAMEWORK OF URDU TOPIC MODELING USING LATENT DIRICHLET ALLOCATION (LDA)
    Shakeel, Khadija
    Tahir, Ghulam Rasool
    Tehseen, Irsha
    Ali, Mubashir
    [J]. 2018 IEEE 8TH ANNUAL COMPUTING AND COMMUNICATION WORKSHOP AND CONFERENCE (CCWC), 2018, : 117 - 123
  • [10] User Behavior Modeling in a Cellular Network Using Latent Dirichlet Allocation
    Giri, Ritwik
    Choi, Heesook
    Hoo, Kevin Soo
    Rao, Bhaskar D.
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2014, 2014, 8669 : 36 - 44