Automated detection of cancerous genomic sequences using genomic signal processing and machine learning

Cited by: 7
Authors
Liu, Dong-Wei [1 ]
Jia, Run-Ping [1 ]
Wang, Cai-Feng [1 ]
Arunkumar, N. [2 ]
Narasimhan, K. [2 ]
Udayakumar, M. [3 ]
Elamaran, V. [2 ]
Affiliations
[1] Shanghai Inst Technol, Sch Mat Sci & Engn, Shanghai 201418, Peoples R China
[2] SASTRA Deemed Univ, Sch EEE, Thanjavur, India
[3] SASTRA Deemed Univ, Sch Chem & Biotechnol, Dept Bioinformat, Thanjavur, India
Keywords
Genomic signal processing; Discrete wavelet transform; Cancer; Support vector machine; Gene sequence; Differentiation; Signal processing
DOI
10.1016/j.future.2018.12.041
Chinese Library Classification number
TP301 [Theory and methods]
Subject classification code
081202
Abstract
Missense mutations are the primary cause of cancer, and identifying mutations in gene sequences is a preliminary step in cancer diagnosis. Identifying a mutation requires differentiating between cancerous and non-cancerous gene sequences. Sequence comparison can identify a mutation only if an existing variant recurs; when no homologous variants are present, sequence identification methods struggle to distinguish cancerous from non-cancerous sequences. Here we use DWT-based genomic signal processing techniques to identify a pattern in the characteristics of the sequences, which can then be used with a machine learning algorithm to differentiate between cancerous and non-cancerous sequences. The cancerous and non-cancerous gene sequences for lung, breast and ovarian cancer are obtained from NCBI. After numerical mapping of the sequences, a four-level DWT is applied using the Haar wavelet, and statistical features (mean, median, standard deviation, interquartile range, skewness and kurtosis) are obtained from the wavelet domain. When supplied to machine learning algorithms, these features yielded an accuracy of 100% in classifying cancerous and non-cancerous sequences with a Support Vector Machine. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 233-237
Number of pages: 5
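
The feature-extraction steps named in the abstract (numerical mapping, four-level Haar DWT, six statistics from the wavelet domain) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say which numerical mapping was used, how odd-length signals were handled, or whether statistics were taken per sub-band, so the integer mapping `NUCLEOTIDE_VALUES`, the repeat-last-sample padding, the per-band feature layout, and all function names here are assumptions. The classifier step (a Support Vector Machine in the paper) is left out; the resulting feature dictionary would be the input to it.

```python
import math
import statistics

# Hypothetical integer mapping; the abstract does not specify the mapping used.
NUCLEOTIDE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def map_sequence(seq):
    """Convert a nucleotide string to floats, skipping unknown symbols."""
    return [NUCLEOTIDE_VALUES[b] for b in seq.upper() if b in NUCLEOTIDE_VALUES]


def haar_step(signal):
    """One Haar DWT level: return (approximation, detail) coefficients."""
    if len(signal) % 2:  # assumption: pad odd lengths by repeating the last sample
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels=4):
    """Multi-level Haar DWT; returns detail bands cD1..cDn and the final cAn."""
    bands = {}
    approx = list(signal)
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        bands[f"cD{level}"] = detail
    bands[f"cA{levels}"] = approx
    return bands


def band_features(coeffs):
    """Six summary statistics of one sub-band (population moments)."""
    n = len(coeffs)
    mean = statistics.fmean(coeffs)
    m2, m3, m4 = (sum((c - mean) ** k for c in coeffs) / n for k in (2, 3, 4))
    q1, _, q3 = statistics.quantiles(coeffs, n=4) if n > 1 else (mean, mean, mean)
    return {
        "mean": mean,
        "median": statistics.median(coeffs),
        "std": math.sqrt(m2),
        "iqr": q3 - q1,
        # Skewness and (non-excess) kurtosis are set to 0 for constant bands.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


def extract_features(seq, levels=4):
    """Map a sequence, decompose it, and flatten per-band statistics."""
    signal = map_sequence(seq)
    if len(signal) < 2 ** levels:
        raise ValueError("sequence too short for the requested number of levels")
    bands = haar_dwt(signal, levels)
    return {
        f"{band}_{name}": value
        for band, coeffs in bands.items()
        for name, value in band_features(coeffs).items()
    }


if __name__ == "__main__":
    demo = "ATGCGTACGTTAGCCGATAGCTAGGCTAACGT"  # made-up sequence, not NCBI data
    for key, value in extract_features(demo).items():
        print(f"{key}: {value:.4f}")
```

With four levels this yields five sub-bands and six statistics each, so 30 features per sequence under the assumptions above. A wavelet library and a different mapping could be substituted without changing the overall shape of the pipeline.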