Improved Hybrid Approach for Enhancing Protein Coding Regions Identification in DNA Sequences

Cited by: 0
Authors
Hassan, Emad S. [1, 2]
Dessouky, Ahmed M. [3]
Fathi, Hesham [3, 4]
Salama, Gerges M. [4]
Oshaba, Ahmed S. [1]
El-Emary, Atef [1]
Abd El-Samie, Fathi E. [2, 5]
Affiliations
[1] Jazan Univ, Coll Engn, Dept Elect Engn, Jizan 45142, Saudi Arabia
[2] Menoufia Univ, Fac Elect Engn, Dept Elect & Elect Commun, Menoufia 32952, Egypt
[3] Egyptian Russian Univ, Fac Artificial Intelligence, Cairo, Egypt
[4] Minia Univ, Fac Engn, Dept Elect Engn Elect & Commun Engn, Al Minya, Egypt
[5] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, Riyadh, Saudi Arabia
Keywords
Bioinformatics; Protein coding regions; Digital signal processing; Wavelet transforms; Sequence analysis; Spectral estimation; NEURAL-NETWORK; PREDICTION;
DOI
10.2174/0115748936287244240117065325
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Introduction: Identifying and predicting protein-coding regions within DNA sequences plays a pivotal role in genomic research. This paper introduces an approach for identifying protein-coding regions in DNA sequences that uses a hybrid methodology, combining a digital bandpass filter with wavelet transforms and several spectral estimation techniques to enhance exon prediction. Specifically, the Haar and Daubechies wavelet transforms are applied to improve the accuracy of protein-coding region (exon) prediction, enabling the extraction of fine details that may be obscured in the original DNA sequences.
Methods: The study applies Haar and Daubechies wavelet transforms, both non-parametric and parametric spectral estimation methods, and a digital bandpass filter for detecting peaks in exon regions. In addition, the Electron-Ion Interaction Potential (EIIP) method is used to convert symbolic DNA sequences into numerical values, and sum-of-sinusoids (SoS) mathematical models with optimized parameters are used to model the DNA sequences, supporting accurate gene identification.
Results: The approach yields a substantial improvement in identification accuracy for protein-coding regions. In terms of peak location detection, applying the Haar and Daubechies wavelet transforms improves the accuracy of peak localization by approximately (0.01, 3-5 dB). With non-parametric and parametric spectral estimation techniques, peak location improves by approximately (0.01, 4 dB) compared to the original signal. The proposed approach also achieves higher accuracy than existing methods.
Conclusion: These findings help bridge gaps in DNA sequence analysis and offer a promising pathway for advancing exonic region prediction and gene identification in genomics research. The hybrid methodology is presented as a robust contribution to genomic analysis techniques.
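The processing chain named in the abstract (EIIP mapping, a period-3 measure, and a wavelet step) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `eiip_encode`, `period3_power`, and `haar_step` are hypothetical, the window length is an arbitrary assumption, and a sliding-window DFT evaluated at 1/3 cycles per base stands in for the paper's digital bandpass filter, whose design is not given here. The EIIP values are the ones commonly quoted in the literature.

```python
import cmath
import math

# Commonly quoted EIIP values for the four nucleotides (assumed here,
# not taken from the paper itself).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq):
    """Map a symbolic DNA string to its EIIP numerical sequence."""
    return [EIIP[base] for base in seq.upper()]


def period3_power(signal, window=351, step=1):
    """Sliding-window power at frequency 1/3 cycles per base.

    Coding regions tend to show a peak at this frequency; the window
    should be a multiple of 3 so that the constant (DC) part cancels.
    """
    # Precompute the three complex exponentials exp(-2*pi*j*n/3).
    twiddle = [cmath.exp(-2j * math.pi * k / 3) for k in range(3)]
    powers = []
    for start in range(0, len(signal) - window + 1, step):
        acc = 0j
        for n in range(window):
            acc += signal[start + n] * twiddle[n % 3]
        powers.append(abs(acc) ** 2)
    return powers


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    scale = math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / scale for i in pairs]
    return approx, detail


if __name__ == "__main__":
    # Toy input: a strongly 3-periodic stretch followed by a homopolymer.
    x = eiip_encode("ATG" * 10 + "A" * 30)
    print(period3_power(x, window=30, step=30))
    print(haar_step(x[:8]))
```

In this toy input, the first window covers the 3-periodic stretch and the second covers the homopolymer, so the period-3 power should be clearly larger in the first. On real sequences the curve is noisy, which is the motivation the abstract gives for adding wavelet and spectral estimation stages.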
Pages: 21