Machine Learning Approaches for the Prediction of Obesity using Publicly Available Genetic Profiles

Authors
Montanez, Casimiro Aday Curbelo [1]
Fergus, Paul [1]
Hussain, Abir [1]
Al-Jumeily, Dhiya [1]
Abdulaimma, Basma [1]
Hind, Jade [1]
Radi, Naeem [2]
Affiliations
[1] Liverpool John Moores Univ, Dept Comp Sci, Liverpool, Merseyside, England
[2] Al Khawarizmi Int Coll, Abu Dhabi, U Arab Emirates
Keywords
Data Science; Feature Selection; Genetics; Machine Learning; Obesity; Receiver Operating Characteristic Curve; Single Nucleotide Polymorphisms; GENOME-WIDE ASSOCIATION; SINGLE NUCLEOTIDE POLYMORPHISM; BIOINFORMATICS CHALLENGES; INDIVIDUALS; MODELS; SUSCEPTIBILITY; THRESHOLD; SELECTION; DATABASE; BORUTA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel approach based on the analysis of genetic variants from publicly available genetic profiles and a manually curated database, the National Human Genome Research Institute Catalog. Using data science techniques, genetic variants are identified in the collected participant profiles and matched against the risk variants indexed in the National Human Genome Research Institute Catalog. The indexed genetic variants, Single Nucleotide Polymorphisms (SNPs), are used as inputs to various machine learning algorithms for the prediction of obesity. The body mass index status of participants is divided into two classes, Normal Class and Risk Class. Dimensionality reduction is performed to generate a set of principal variables, 13 SNPs, to which the machine learning methods are applied. The models are evaluated using receiver operating characteristic curves and the area under the curve. Machine learning techniques including gradient boosting, generalized linear model, classification and regression trees, k-nearest neighbours, support vector machines, random forest and multilayer perceptron neural network are compared in terms of their ability to identify the most important factors among the initial 6622 variables, which describe genetic variants, age and gender, and to classify a subject into one of the body mass index related classes defined in this study. Our simulation results indicate that the support vector machine achieved the highest area under the curve, 90.5%.
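The abstract names the area under the receiver operating characteristic curve as the evaluation metric. As a minimal sketch of how that quantity can be computed for a two-class problem such as Normal Class versus Risk Class, the Python below uses the rank-based identity: the AUC equals the fraction of positive/negative pairs in which the positive example receives the higher score. The function name `roc_auc`, the label encoding (1 for the hypothetical Risk Class) and the toy scores are assumptions made for illustration; they are not taken from the paper, and this is not the authors' implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: sequence of 0/1, where 1 marks the positive class
            (assumed here to stand for a "Risk Class" subject).
    scores: sequence of classifier scores; higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (not data from the paper).
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

On this toy input, 8 of the 9 positive/negative pairs are ordered correctly, so the function returns 8/9. The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based ranking would be the usual choice for larger cohorts.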
Pages: 2743-2750
Page count: 8