Genomic signal processing;
Discrete wavelet transform;
Cancer;
Support vector machine;
Gene sequence;
Differentiation;
Signal processing;
DOI:
10.1016/j.future.2018.12.041
Chinese Library Classification number:
TP301 [Theory, Methods];
Discipline classification code:
081202;
Abstract:
Missense mutations are the primary cause of cancer. Identifying mutations in gene sequences is a preliminary step in the diagnosis of cancer, and it requires differentiating between cancerous and non-cancerous gene sequences. Sequence comparison can identify a mutation only if a known variant recurs. When no homologous variants are available, it is difficult to distinguish cancerous from non-cancerous sequences with a sequence identification method. Here we use Genomic Signal Processing techniques based on the discrete wavelet transform (DWT) to identify a pattern in the characteristics of the sequences, which in turn can be used with a machine learning algorithm to differentiate between cancerous and non-cancerous sequences. The cancerous and non-cancerous gene sequences for lung cancer, breast cancer, and ovarian cancer were obtained from NCBI. After numerical mapping of the sequences, a four-level DWT is applied using the Haar wavelet, and statistical features (mean, median, standard deviation, interquartile range, skewness, and kurtosis) are obtained from the wavelet domain. When these statistical features were supplied to machine learning algorithms, the Support Vector Machine classified cancerous and non-cancerous sequences with an accuracy of 100%. (C) 2019 Elsevier B.V. All rights reserved.
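The feature-extraction pipeline described in the abstract (numerical mapping, four-level Haar DWT, six statistics) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which numerical mapping was used, so the integer mapping `NUCLEOTIDE_MAP` is an assumption, as are the function names, the choice to compute the statistics separately on each sub-band, the use of the population standard deviation, and the non-excess definition of kurtosis. Only the Python standard library is used, and the classifier stage is left out.

```python
import math
import statistics

# Assumed mapping: the abstract does not specify the numerical mapping.
NUCLEOTIDE_MAP = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def map_sequence(seq):
    """Convert a nucleotide string into a list of numbers."""
    return [NUCLEOTIDE_MAP[base] for base in seq.upper()]


def haar_step(signal):
    """One level of the orthonormal Haar DWT.

    Odd-length input is padded by repeating the last sample.
    """
    if len(signal) % 2:
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def haar_dwt(signal, levels=4):
    """Return [d1, d2, ..., dN, aN] for an N-level Haar decomposition."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return coeffs


def describe(values):
    """Mean, median, std, IQR, skewness, kurtosis of one sub-band.

    Needs at least two values (a requirement of statistics.quantiles).
    """
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4)
    centered = [v - mean for v in values]
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    # A constant sub-band has zero spread; report 0.0 rather than divide by zero.
    skew = m3 / std ** 3 if std > 0 else 0.0
    kurt = m4 / std ** 4 if std > 0 else 0.0  # non-excess kurtosis
    return [mean, statistics.median(values), std, q3 - q1, skew, kurt]


def extract_features(seq, levels=4):
    """Flat feature vector: six statistics for each of the levels + 1 sub-bands."""
    features = []
    for band in haar_dwt(map_sequence(seq), levels):
        features.extend(describe(band))
    return features


if __name__ == "__main__":
    toy_sequence = "ACGT" * 16  # made-up 64-base sequence, not NCBI data
    print(len(extract_features(toy_sequence)))
```

With four levels, the sequence should be at least 32 bases long so that the coarsest sub-bands still hold two coefficients. Feature vectors of this kind would then be passed to a classifier such as a Support Vector Machine, which is omitted here to keep the sketch dependency-free.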