Classification of DNA Sequence Using Machine Learning

Cited by: 0
Authors
Kanumalli, Satya Sandeep [1]
Swathi, S. [1]
Sukanya, K. [1]
Yamini, V. [1]
Nagalakshmi, N. [1]
Affiliations
[1] Vignans Nirula Inst Technol & Sci Women, CSE Dept, Guntur, Andhra Pradesh, India
Keywords
Machine learning; DNA sequencing; AdaBoost algorithm; Bioinformatics
DOI
10.1007/978-981-19-3590-9_57
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In medical informatics research, gene sequences are widely used as a basis for classification. Biochemistry is one application area of machine learning (ML). Bioinformatics is an interdisciplinary science that uses computer and communication science to understand biological data. One of its most difficult tasks is distinguishing normal genes from disease-causing genes. In genomic research, classifying gene sequences into existing categories is used to discover the functions of novel proteins, so it is critical to identify and categorize such genes. We employ ML classification methods to distinguish between infected and normal genes. AdaBoost achieves high precision; unlike the bagging and Random Forest algorithms, it takes the weight of each classifier fully into account. An AdaBoost-based learning approach generates a sequence of weak classifiers by finding the most 'informative' or 'discriminating' features. A cascade structure for identification can also help limit false-positive results. This study gives an overview of the mechanics of gene sequence classification using ML techniques, including a brief introduction to bioinformatics and the important challenges in DNA sequencing with ML.
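The abstract does not state which features, data, or implementation the authors used. As a rough illustration only, the sketch below shows how discrete AdaBoost can combine weak classifiers over sequence features. Everything in it is an assumption made for the example: the overlapping 2-mer count features, the single-threshold decision stumps, the function names, and the six toy sequences with their +1/-1 labels, which are invented and are not real gene data.

```python
import math
from collections import Counter
from itertools import product

def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a DNA string, in a fixed feature order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]

def stump_predict(x, feature, threshold, sign):
    """Weak classifier: threshold on a single k-mer count, output +1 or -1."""
    return sign if x[feature] > threshold else -sign

def best_stump(X, y, w):
    """Return (error, feature, threshold, sign) with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, feature, threshold, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost_fit(X, y, rounds=5):
    """Discrete AdaBoost: reweight samples each round, weight each stump by alpha."""
    n = len(X)
    w = [1.0 / n] * n                          # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # classifier weight
        model.append((alpha, feature, threshold, sign))
        # Increase weights of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feature, threshold, sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, t, s) for a, f, t, s in model)
    return 1 if score >= 0 else -1

# Invented toy data: +1 = GC-rich, -1 = AT-rich (illustrative labels only).
sequences = ["GCGCGCGGCC", "CCGGGCGCGC", "GGCCGCGCCG",
             "ATATTATAAT", "TTAATATATA", "AATTATATTA"]
y = [1, 1, 1, -1, -1, -1]
X = [kmer_counts(s) for s in sequences]

model = adaboost_fit(X, y)
train_predictions = [adaboost_predict(model, x) for x in X]
```

Because the toy classes are separable by a single 2-mer count, the first stump already fits the training set; on realistic data the reweighting step would push later stumps toward the sequences that earlier ones misclassified. Each selected stump corresponds to one feature, which is one way to read the abstract's remark about finding the most informative features.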
Pages: 723-732
Page count: 10