A single nucleotide polymorphism panel for individual identification and ancestry assignment in Caucasians and four East and Southeast Asian populations using a machine learning classifier

Authors
Hsiao-Lin Hwa
Ming-Yih Wu
Chih-Peng Lin
Wei Hsin Hsieh
Hsiang-I Yin
Tsui-Ting Lee
James Chun-I Lee
Affiliations
[1] National Taiwan University, Department and Graduate Institute of Forensic Medicine, College of Medicine
[2] National Taiwan University Hospital, Department of Obstetrics and Gynecology
[3] National Taiwan University Hospital, Department of Medical Genetics
[4] Yourgene Bioscience
Keywords
Ancestry assignment; Array; Individual identification; Machine learning classifier; Single nucleotide polymorphism; Support vector machine
DOI
Not available
Abstract
Single nucleotide polymorphism (SNP) profiling is an effective means of individual identification and ancestry inference in forensic genetics. This study established an SNP panel for the simultaneous individual identification and ancestry assignment of Caucasian and four East and Southeast Asian populations. We analyzed 220 SNPs (125 autosomal, 17 X-chromosomal, 30 Y-chromosomal, and 48 mitochondrial SNPs) in DNA samples from 563 unrelated individuals of five populations (89 Caucasian, 234 Taiwanese Han, 90 Filipino, 79 Indonesian, and 71 Vietnamese) and in 18 degraded DNA samples. Informativeness for assignment (In) was used to select ancestry informative SNPs (AISNPs). A machine learning classifier, the support vector machine (SVM), was used for ancestry assignment. Of the 220 SNPs, 62 were individual identification SNPs (IISNPs) (51 autosomal and 11 X-chromosomal SNPs) and 191 were AISNPs (100 autosomal, 13 X-chromosomal, 30 Y-chromosomal, and 48 mitochondrial SNPs). The 51 autosomal IISNPs offered cumulative random match probabilities (cRMPs) ranging from 1.56 × 10⁻²¹ to 3.16 × 10⁻²² among these five populations. Using AISNPs with the SVM, the overall accuracy of ancestry inference in the testing dataset was 88.9% between the Caucasian, Taiwanese Han, and Filipino populations, whereas it was 70.0% between Caucasians and each of the four East and Southeast Asian populations. For the 18 degraded DNA samples with incomplete profiling, the accuracy of ancestry assignment was 94.4%. We have developed a 220-SNP panel for simultaneous individual identification and ethnic origin differentiation between Caucasian and the four East and Southeast Asian populations. This SNP panel may assist with DNA analysis in forensic casework.
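The abstract names two quantities without giving their formulas: informativeness for assignment (In) and the cumulative random match probability (cRMP). The sketch below is illustrative only and is not the authors' code. It assumes the commonly used definition of In from Rosenberg et al. (natural logarithms, 0 · log 0 taken as 0), and it assumes that the cRMP is the product of per-SNP match probabilities under Hardy-Weinberg equilibrium with independent loci. All function names and the allele frequencies in the usage note are hypothetical.

```python
import math
from typing import Sequence


def informativeness_for_assignment(freqs: Sequence[Sequence[float]]) -> float:
    """Informativeness for assignment (I_n) for a single locus.

    freqs[i][j] is the frequency of allele j in population i; each row is
    expected to sum to 1. Natural logarithms are used and 0 * log(0) is
    treated as 0. (Assumed definition, not taken from the abstract.)
    """
    k = len(freqs)               # number of populations
    n_alleles = len(freqs[0])    # number of alleles at the locus
    total = 0.0
    for j in range(n_alleles):
        mean_pj = sum(row[j] for row in freqs) / k
        if mean_pj > 0:
            total -= mean_pj * math.log(mean_pj)
        for row in freqs:
            if row[j] > 0:
                total += (row[j] / k) * math.log(row[j])
    return total


def snp_match_probability(p: float) -> float:
    """Match probability of one biallelic SNP under Hardy-Weinberg.

    Sum of squared genotype frequencies: (p^2)^2 + (2pq)^2 + (q^2)^2.
    """
    q = 1.0 - p
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def cumulative_rmp(allele_freqs: Sequence[float]) -> float:
    """Product of per-SNP match probabilities, assuming independent loci."""
    result = 1.0
    for p in allele_freqs:
        result *= snp_match_probability(p)
    return result
```

As a usage note, `informativeness_for_assignment([[0.9, 0.1], [0.2, 0.8]])` scores a hypothetical SNP whose allele frequencies differ strongly between two populations, and `cumulative_rmp([0.5] * 51)` gives the cRMP for 51 hypothetical SNPs that each have an allele frequency of 0.5. SNPs with higher In separate populations better, while SNPs with allele frequencies near 0.5 in every population contribute the smallest per-locus match probability.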
Pages: 67-74
Page count: 7
Related papers
  • [1] A single nucleotide polymorphism panel for individual identification and ancestry assignment in Caucasians and four East and Southeast Asian populations using a machine learning classifier
    Hwa, Hsiao-Lin
    Wu, Ming-Yih
    Lin, Chih-Peng
    Hsieh, Wei Hsin
    Yin, Hsiang-I
    Lee, Tsui-Ting
    Lee, James Chun-I
    FORENSIC SCIENCE MEDICINE AND PATHOLOGY, 2019, 15 (01) : 67 - 74
  • [2] A panel of 130 autosomal single-nucleotide polymorphisms for ancestry assignment in five Asian populations and in Caucasians
    Hsiao-Lin Hwa
    Chih-Peng Lin
    Tsun-Ying Huang
    Po-Hsiu Kuo
    Wei-Hsin Hsieh
    Chun-Yen Lin
    Hsiang-I Yin
    Li-Hui Tseng
    James Chun-I Lee
    Forensic Science, Medicine, and Pathology, 2017, 13 : 177 - 187