Enhancing selection of alcohol consumption-associated genes by random forest

Authors
Lyu, Chenglin [1, 2]
Joehanes, Roby [3]
Huan, Tianxiao [3]
Levy, Daniel [3]
Li, Yi [1]
Wang, Mengyao [1]
Liu, Xue [1]
Liu, Chunyu [1]
Ma, Jiantao [4]
Affiliations
[1] Boston Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02118 USA
[2] Boston Univ, Chobanian & Avedisian Sch Med, Dept Anat & Neurobiol, Boston, MA 02118 USA
[3] NHLBI, Framingham Heart Study & Populat Sci Branch, Framingham, MA 01702 USA
[4] Tufts Univ, Friedman Sch Nutr Sci & Policy, Nutr Epidemiol & Data Sci, Boston, MA 02111 USA
Keywords
Alcohol consumption; Gene expression; CVD; Machine learning; Random forest; Boruta; Identifies candidate genes; Risk prediction; Heart; Design
DOI
10.1017/S0007114524000795
Chinese Library Classification (CLC) number
R15 [Nutrition hygiene, food hygiene]; TS201 [Basic science]
Discipline classification code
100403
Abstract
Machine learning methods have been used to identify omics markers for a variety of phenotypes. We aimed to examine whether a supervised machine learning algorithm can improve the identification of alcohol-associated transcriptomic markers. In this study, we analysed array-based, whole-blood-derived expression data for 17 873 gene transcripts in 5508 Framingham Heart Study participants. Using the Boruta algorithm, a supervised random forest (RF)-based feature selection method, we selected twenty-five alcohol-associated transcripts. In a testing set (30 % of all study participants), the AUC (area under the receiver operating characteristic curve) values of these twenty-five transcripts were 0·73, 0·69 and 0·66 for non-drinkers v. moderate drinkers, non-drinkers v. heavy drinkers and moderate drinkers v. heavy drinkers, respectively. The AUC values of the transcripts selected by the Boruta method were comparable to those of transcripts identified using conventional linear regression models; for example, the AUC values of 1958 transcripts identified by conventional linear regression models (false discovery rate < 0·2) were 0·74, 0·66 and 0·65, respectively. With Bonferroni correction for the twenty-five Boruta method-selected transcripts and three CVD risk factors (i.e. at P < 6·7e-4), we observed that thirteen transcripts were associated with obesity, three transcripts with type 2 diabetes and one transcript with hypertension. For example, alcohol consumption was inversely associated with the expression of DOCK4, IL4R and SORT1; DOCK4 and SORT1 were positively associated with obesity, and IL4R was inversely associated with hypertension. In conclusion, using a supervised machine learning method, the RF-based Boruta algorithm, we identified novel alcohol-associated gene transcripts.
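The two quantities the abstract relies on, the Bonferroni-corrected threshold and the AUC, can be illustrated with a minimal sketch. It assumes a family-wise significance level of 0.05, which the abstract does not state but which reproduces the reported P < 6·7e-4 for 25 transcripts × 3 risk factors = 75 tests. The function names and the toy scores are hypothetical and are not study data; the AUC is computed from its pairwise definition rather than by the authors' actual software.

```python
from itertools import product


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Assumption: alpha = 0.05 (not stated in the abstract).
n_transcripts = 25   # Boruta-selected transcripts (from the abstract)
n_risk_factors = 3   # obesity, type 2 diabetes, hypertension
threshold = bonferroni_threshold(0.05, n_transcripts * n_risk_factors)
print(f"Bonferroni threshold: {threshold:.1e}")

# Hypothetical classifier scores, e.g. for drinkers (positive class)
# and non-drinkers (negative class); illustrative only.
toy_pos = [0.9, 0.8, 0.4]
toy_neg = [0.7, 0.3, 0.2]
print(f"Toy AUC: {auc(toy_pos, toy_neg):.2f}")
```

The pairwise definition is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for a testing set of the size described in the abstract.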
Pages: 2049-2057
Number of pages: 9