Evaluation of supervised machine-learning methods for predicting appearance traits from DNA

Cited by: 11
Authors
Katsara, Maria-Alexandra [1]
Branicki, Wojciech [2]
Walsh, Susan [3]
Kayser, Manfred [4]
Nothnagel, Michael [1, 5, 6]
Affiliations
[1] Univ Cologne, Cologne Ctr Genom, Cologne, Germany
[2] Jagiellonian Univ, Malopolska Ctr Biotechnol, Krakow, Poland
[3] Indiana Univ Purdue Univ Indianapolis IUPUI, Dept Biol, Indianapolis, IN USA
[4] Erasmus MC Univ Med Ctr Rotterdam, Dept Genet Identificat, Rotterdam, Netherlands
[5] Fac Med, Cologne, Germany
[6] Cologne Univ Hosp, Cologne, Germany
Funding
European Union Horizon 2020
Keywords
Externally visible characteristics; Predictive DNA analysis; Appearance prediction; Genetic prediction; DNA phenotyping; Forensic DNA phenotyping; Machine learning; Classifiers; GENOME-WIDE ASSOCIATION; SKIN COLOR PREDICTION; EYE COLOR; GENETIC-DETERMINANTS; PIGMENTATION; HAIR; SYSTEM; PHENOTYPES; COMPLEX; IMPACT;
DOI
10.1016/j.fsigen.2021.102507
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
The prediction of human externally visible characteristics (EVCs) based solely on DNA information has become an established approach in forensic and anthropological genetics in recent years. While predictive models for a large set of EVCs have already been established using multinomial logistic regression (MLR), the prediction performance of other possible classification methods has not been thoroughly investigated thus far. To identify a potential classifier that outperforms these trait-specific models, we conducted a systematic comparison between the widely used MLR and three popular machine learning (ML) classifiers that have shown good performance outside EVC prediction: support vector machines (SVM), random forest (RF) and artificial neural networks (ANN). As examples, we used eye, hair and skin color categories as phenotypes, with genotypes based on the previously established IrisPlex, HIrisPlex, and HIrisPlex-S DNA markers. We compared and assessed the performance of each of the four methods, complemented by detailed hyperparameter tuning that was applied to some of the methods in order to maximize their performance. Overall, all four classification methods showed rather similar performance, with no method being substantially superior to the others for any of the traits, although performance varied slightly across the different traits and more so across the trait categories. Hence, based on our findings, none of the ML methods applied here provides any advantage for appearance prediction, at least for the categorical pigmentation traits and the selected DNA markers used here.
Pages: 9