Lung Cancer Classification Models Using Discriminant Information of Mutated Genes in Protein Amino Acids Sequences

Cited by: 5
Authors
Sattar, Mohsin [1 ]
Majid, Abdul [1 ]
Affiliations
[1] Pakistan Inst Engn & Appl Sci, Dept Comp & Informat Sci, Biomed Informat Res Lab, PO Nilore, Islamabad, Pakistan
Keywords
Lung cancer; Amino acids; Classification; Diagnosis; Prognosis; Machine learning; PREDICTION; MUTATIONS; RISK;
DOI
10.1007/s13369-018-3468-8
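Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes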
07 ; 0710 ; 09 ;
Abstract
Lung cancer is a heterogeneous disease characterized by the uncontrolled growth of cells, and it is a major cause of cancer-related deaths. Early diagnosis is important for treatment and patient survival. In this study, through statistical analysis of cancerous protein sequences, we identified the mutated genes associated with the etiology of lung cancer. Our analysis revealed that the most frequently mutated genes are TP53, EGFR, KMT2D, PDE4DIP, ATM, ZNF521, DICER1, CTNNB1, RUNX1T1, SMARCA4, FBXW7, NF1, PIK3CA, STK11, NTRK3, APC, PTPRB, BRCA2, MYH11 and AMER1. We observed that abnormal mutations in these genes contribute to variations in the composition of amino acid sequences. This variation was described in various feature spaces using statistical and physicochemical properties of amino acids. These influential features provide sufficient discriminative power for developing effective lung cancer classification models (LCCMs). The main advantage of the proposed novel approach is its effective use of the discriminant information of mutated genes. Experimental results showed that the SVM model performs best in the split amino acid composition feature space. The study thus explores a new dimension of early lung cancer classification using discriminant information revealed through statistical analysis of mutated genes. It is anticipated that the proposed approach would be useful to practitioners and domain experts for early lung cancer diagnosis and prognosis.
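The abstract names amino acid composition features, and in particular a split amino acid composition feature space, without giving their definitions. The following is a minimal sketch of one common formulation, not the authors' implementation: the function names `aac` and `split_aac`, the three-way split (N-terminal, middle, C-terminal), and the `terminal_len` parameter are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order (defines the feature layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: fraction of each standard residue in seq."""
    seq = seq.upper()
    # Ignore non-standard symbols (e.g. X, B, Z) rather than guessing them.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def split_aac(seq, terminal_len=25):
    """Split amino acid composition (assumed three-way split).

    Computes the composition separately for the N-terminal segment,
    the middle segment, and the C-terminal segment, then concatenates
    the three 20-value vectors into one 60-value feature vector.
    terminal_len is a hypothetical default, not taken from the paper.
    """
    seq = seq.upper()
    # Cap the terminal length so short sequences still yield three parts.
    n = min(terminal_len, len(seq) // 3)
    n_term = seq[:n]
    middle = seq[n:len(seq) - n]
    c_term = seq[len(seq) - n:]
    return aac(n_term) + aac(middle) + aac(c_term)
```

Feature vectors of this kind could then be passed to any standard SVM implementation; which kernel and parameters the study used is not stated in the abstract.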
Pages: 3197-3211
Page count: 15
Related papers
47 in total
  • [1] Lung Cancer Classification Models Using Discriminant Information of Mutated Genes in Protein Amino Acids Sequences
    Mohsin Sattar
    Abdul Majid
    [J]. Arabian Journal for Science and Engineering, 2019, 44 : 3197 - 3211
  • [2] An information-theoretic classification of amino acids for the assessment of interfaces in protein–protein docking
    Christophe Jardin
    Arno G. Stefani
    Martin Eberhardt
    Johannes B. Huber
    Heinrich Sticht
    [J]. Journal of Molecular Modeling, 2013, 19 : 3901 - 3910
  • [3] Lung cancer type classification using differentiator genes
    Ramroach, Sterling
    John, Melford
    Joshi, Ajay
    [J]. GENE REPORTS, 2020, 19
  • [4] An information-theoretic classification of amino acids for the assessment of interfaces in protein-protein docking
    Jardin, Christophe
    Stefani, Arno G.
    Eberhardt, Martin
    Huber, Johannes B.
    Sticht, Heinrich
    [J]. JOURNAL OF MOLECULAR MODELING, 2013, 19 (09) : 3901 - 3910
  • [5] Classification of amino-acid sequences using state-space models
    Statistik, F
    [J]. BETWEEN DATA SCIENCE AND APPLIED DATA ANALYSIS, 2003, : 615 - 623
  • [6] Information-theoretic analysis of protein sequences shows that amino acids self-cluster
    Cai, Y
    Dodson, CTJ
    Doig, AJ
    Wolkenhauer, O
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 2002, 218 (04) : 409 - 418
  • [7] Cancer disease multinomial classification using transfer learning and SVM on the genes’ sequences
    Slimene, Ines
    Messaoudi, Imen
    Oueslati, Afef Elloumi
    Lachiri, Zied
    [J]. EAI Endorsed Transactions on Pervasive Health and Technology, 2023, 9 (01)
  • [8] Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids
    Das, Jayanta Kumar
    Das, Provas
    Ray, Korak Kumar
    Choudhury, Pabitra Pal
    Jana, Siddhartha Sankar
    [J]. PLOS ONE, 2016, 11 (12)
  • [9] Simplified Protein Models: Predicting Folding Pathways and Structure Using Amino Acid Sequences
    Adhikari, Aashish N.
    Freed, Karl F.
    Sosnick, Tobin R.
    [J]. PHYSICAL REVIEW LETTERS, 2013, 111 (02)
  • [10] Prediction of Protein-Protein Interactions from Sequences using a Correlation Matrix of the Physicochemical Properties of Amino Acids
    Kopoin, Charlemagne N'Diffon
    Atiampo, Armand Kodjo
    N'Guessan, Behou Gerard
    Babri, Michel
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2021, 21 (03): : 41 - 47