Challenges in Identifying Asthma Subgroups Using Unsupervised Statistical Learning Techniques

Cited by: 38
Authors
Prosperi, Mattia C. F. [1, 2]
Sahiner, Umit M. [3]
Belgrave, Danielle [1, 2]
Sackesen, Cansin [3]
Buchan, Iain E. [1]
Simpson, Angela [2]
Yavuz, Tolga S. [3]
Kalayci, Omer [3]
Custovic, Adnan [2]
Affiliations
[1] Univ Manchester, Inst Populat Hlth, Ctr Hlth Informat, Manchester M13 9PL, Lancs, England
[2] Univ Manchester, Inst Inflammat & Repair, Ctr Resp Med & Allergy, Manchester M13 9PL, Lancs, England
[3] Hacettepe Univ, Sch Med, Pediat Allergy & Asthma Unit, Ankara, Turkey
Keywords
asthma; children; clustering; machine learning; endotypes; CLUSTER-ANALYSIS; CHILDREN; EXERCISE; PHENOTYPES; MIXTURE; SELECTION; SYMPTOMS; TREE;
DOI
10.1164/rccm.201304-0694OC
Chinese Library Classification (CLC) number
R4 [Clinical Medicine]
Subject classification codes
1002; 100602
Abstract
Rationale: Unsupervised statistical learning techniques, such as exploratory factor analysis (EFA) and hierarchical clustering (HC), have been used to identify asthma phenotypes, with partly consistent results. Some of the inconsistency is caused by the variable selection and demographic and clinical differences among study populations.

Objectives: To investigate the effects of the choice of statistical method and different preparations of data on the clustering results; and to relate these to disease severity.

Methods: Several variants of EFA and HC were applied and compared using various sets of variables and different encodings and transformations within a dataset of 383 children with asthma. Variables included lung function, inflammatory and allergy markers, family history, environmental exposures, and medications. Clusters and original variables were related to asthma severity (logistic regression and Bayesian network analysis).

Measurements and Main Results: EFA identified five components (eigenvalues >= 1) explaining 35% of the overall variance. Variations of the HC (as linkage-distance functions) did not affect the cluster inference; however, using different variable encodings and transformations did. The derived clusters predicted asthma severity less than the original variables. Prognostic factors of severity were medication usage, current symptoms, lung function, paternal asthma, body mass index, and age of asthma onset. Bayesian networks indicated conditional dependence among variables.

Conclusions: The use of different unsupervised statistical learning methods and different variable sets and encodings can lead to multiple and inconsistent subgroupings of asthma, not necessarily correlated with severity. The search for asthma phenotypes needs more careful selection of markers, consistent across different study populations, and more cautious interpretation of results from unsupervised learning.
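The abstract's point that variable encodings and transformations can change hierarchical clustering results can be illustrated with a minimal sketch. This is not the authors' code or data: the four toy records, the variable names (a lung-function percentage and a binary allergy flag), and the helper functions `zscore` and `average_linkage` are all hypothetical, and the sketch assumes Euclidean distance with average linkage purely for illustration.

```python
from statistics import mean, pstdev

def zscore(rows):
    """Standardize each column to mean 0 and unit (population) SD."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def average_linkage(rows, k):
    """Naive agglomerative clustering (average linkage), stopped at k clusters."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = mean(euclid(rows[a], rows[b])
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy records: [lung function (% predicted), allergy flag (0/1)].
raw = [[80, 0], [82, 1], [90, 0], [92, 1]]

print("raw encoding:   ", average_linkage(raw, 2))
print("z-score scaling:", average_linkage(zscore(raw), 2))
```

On these toy records, the raw encoding groups the records by the lung-function value (which dominates the distance because of its larger scale), while standardizing both columns groups them by the binary flag instead. The same algorithm therefore yields different subgroups depending only on how the variables are prepared.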
Pages: 1303-1312
Page count: 10
Related papers
50 in total
  • [1] Challenges in identifying asthma endotypes using unsupervised learning techniques
    Prosperi, M. C. F.
    Sahiner, U. M.
    Belgrave, D.
    Buchan, I. E.
    Sackesen, C.
    Simpson, A.
    Yavuz, T. S.
    Custovic, A.
    Kalayci, O.
    [J]. ALLERGY, 2013, 68 : 668 - 669
  • [2] Identifying Phenogroups in patients with subclinical diastolic dysfunction using unsupervised statistical learning
    Yvonne E. Kaptein
    Ilya Karagodin
    Hongquan Zuo
    Yu Lu
    Jun Zhang
    John S. Kaptein
    Jennifer L. Strande
    [J]. BMC Cardiovascular Disorders, 20
  • [3] Identifying Phenogroups in patients with subclinical diastolic dysfunction using unsupervised statistical learning
    Kaptein, Yvonne E.
    Karagodin, Ilya
    Zuo, Hongquan
    Lu, Yu
    Zhang, Jun
    Kaptein, John S.
    Strande, Jennifer L.
    [J]. BMC CARDIOVASCULAR DISORDERS, 2020, 20 (01)
  • [4] Identifying novel subgroups in heart failure patients with unsupervised machine learning: A scoping review
    Sun, Jin
    Guo, Hua
    Wang, Wenjun
    Wang, Xiao
    Ding, Junyu
    He, Kunlun
    Guan, Xizhou
    [J]. FRONTIERS IN CARDIOVASCULAR MEDICINE, 2022, 9
  • [5] Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges
    Usama, Muhammad
    Qadir, Junaid
    Raza, Aunn
    Arif, Hunain
    Yau, Kok-Lim Alvin
    Elkhatib, Yehia
    Hussain, Amir
    Al-Fuqaha, Ala
    [J]. IEEE ACCESS, 2019, 7 : 65579 - 65615
  • [6] Identifying schizophrenia subgroups using clustering and supervised learning
    Talpalaru, Alexandra
    Bhagwat, Nikhil
    Devenyi, Gabriel A.
    Lepage, Martin
    Chakravarty, M. Mallar
    [J]. SCHIZOPHRENIA RESEARCH, 2019, 214 : 51 - 59
  • [7] Identifying Uncertain Galaxy Morphologies Using Unsupervised Learning
    Edwards, Kieran Jay
    Gaber, Mohamed Medhat
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT II, 2013, 7895 : 146 - 157
  • [8] Analysis of Sentiments using Unsupervised Learning Techniques
    Usha, M. S.
    Devi, M. Indra
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES), 2013, : 241 - 245
  • [9] Identifying the Opportunities and Challenges of Project Bundling: Modeling and Discovering Key Patterns Using Unsupervised Machine Learning
    Assaf, Ghiwa
    Assaad, Rayan H.
    Karaa, Fadi
    [J]. JOURNAL OF INFRASTRUCTURE SYSTEMS, 2024, 30 (01)
  • [10] Unsupervised learning for hierarchical clustering using statistical information
    Okamoto, M
    Bu, N
    Tsuji, T
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2004, PT 1, 2004, 3173 : 834 - 839