ExomeHMM: A Hidden Markov Model for Detecting Copy Number Variation Using Whole-Exome Sequencing Data

Cited by: 5
Authors
Guo, Cheng [1]
Yu, Zhenhua [1]
Wang, Minghui [1,2]
Li, Ao [1,2]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Anhui, Peoples R China
[2] Univ Sci & Technol China, Res Ctr Biomed Engn, Hefei, Anhui, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Copy number variation; expectation-maximization algorithm; hidden Markov model; next generation sequencing; Viterbi algorithm; whole-exome sequencing; BREAST-CANCER; ACCURATE DETECTION; ANALYSIS TOOLKIT; HETEROZYGOSITY; VARIANTS; IDENTIFICATION; EXPRESSION; CAPTURE; DISEASE;
DOI
10.2174/1574893611666160727160757
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Copy number variations (CNVs), including amplifications and deletions, are alterations of DNA copy number relative to a reference genome. CNVs play a crucial role in tumorigenesis and tumor progression: amplification of oncogenes and deletion of tumor suppressor genes may significantly increase the risk of cancer. CNVs are also reported to be closely associated with non-cancer diseases, such as Down syndrome, Parkinson's disease, and Alzheimer's disease. Objective: Whole-exome sequencing (WES) has been successfully applied to the discovery of gene mutations as well as to clinical diagnosis. However, estimating copy number from WES data is challenging because of read depth bias, the distribution pattern of exons, and normal cell contamination. Our aim is to develop an efficient method that overcomes these challenges and detects CNVs using WES data. Method: In this study, we present ExomeHMM, a hidden Markov model (HMM)-based CNV detection algorithm. ExomeHMM exploits relative read depth, a ratio-based signal, to mitigate read depth distortion, and employs an exponentially attenuated transition matrix to handle sparsely and non-uniformly distributed exons. An expectation-maximization algorithm is used to optimize the parameters of the proposed model. Finally, the standard Viterbi algorithm is used to infer the copy number of exons. Results: Using previously identified CNVs in 1000 Genomes Project data as the gold standard, ExomeHMM achieves the highest F-score among the four methods compared in this study. When applied to triple-negative breast cancer data, ExomeHMM is capable of finding abnormal genes that are significantly associated with breast cancer. Conclusion: ExomeHMM is a suitable tool for CNV detection in both healthy samples and clinical tumor samples using whole-exome sequencing data.
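The Method part of the abstract names three ingredients: a ratio-based read depth signal, a transition matrix attenuated exponentially with the distance between exons, and Viterbi decoding. The following is a minimal illustrative sketch of how such pieces could fit together. It is not the authors' implementation. The three-state copy number set, the Gaussian emission on log2 ratios, the mixing form of the attenuation, and every parameter value (`stay`, `decay_bp`, `sigma`) are assumptions made for illustration only; in particular, the parameters are fixed by hand here rather than estimated with expectation-maximization as the abstract describes.

```python
import math

# Assumed state set: copy number 1 (deletion), 2 (normal), 3 (amplification).
STATES = [1, 2, 3]

def transition_matrix(distance_bp, stay=0.99, decay_bp=1e5):
    """Hypothetical distance-attenuated transition matrix.

    Neighbouring exons keep a strong preference for staying in the same
    state; as the gap grows, the matrix decays towards uniform transitions.
    """
    n = len(STATES)
    f = math.exp(-distance_bp / decay_bp)  # 1 for touching exons, near 0 for distant ones
    uniform = 1.0 / n
    return [[f * (stay if i == j else (1 - stay) / (n - 1)) + (1 - f) * uniform
             for j in range(n)] for i in range(n)]

def log_emission(log_ratio, copy_number, sigma=0.25):
    """Gaussian log-density of a log2 depth ratio (assumed emission model)."""
    mean = math.log2(copy_number / 2)  # expected log2 ratio against a diploid reference
    z = (log_ratio - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

def viterbi(log_ratios, positions):
    """Most likely copy number per exon, given log2 ratios and exon positions."""
    n = len(STATES)
    score = [math.log(1.0 / n) + log_emission(log_ratios[0], c) for c in STATES]
    back = []
    for t in range(1, len(log_ratios)):
        # The transition matrix is rebuilt for each gap between consecutive exons.
        A = transition_matrix(positions[t] - positions[t - 1])
        new_score, pointers = [], []
        for j, c in enumerate(STATES):
            best_i = max(range(n), key=lambda i: score[i] + math.log(A[i][j]))
            new_score.append(score[best_i] + math.log(A[best_i][j])
                             + log_emission(log_ratios[t], c))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[i] for i in path]

if __name__ == "__main__":
    ratios = [0.0, 0.05, -1.0, -0.95, -1.05, 0.0]   # made-up log2 ratios
    starts = [1000, 2000, 3000, 4000, 5000, 6000]   # made-up exon positions (bp)
    print(viterbi(ratios, starts))
```

Recomputing the matrix per exon pair is what lets closely spaced exons share a state while widely separated exons are decoded almost independently, which is the role the abstract assigns to the attenuated transition matrix.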
Pages: 147-155
Number of pages: 9