Unsupervised Machine Learning Techniques to Categorize Genomic Islands

Authors
Ghaffari, Noushin [1]
Zhou, Lijie [1]
Nazara, Rabeya [1]
Mageeney, Catherine M. [2]
Williams, Kelly P. [2]
Affiliations
[1] Prairie View A&M Univ, Dept Comp Sci, Roy G Perry Coll Engn, Prairie View, TX 77446 USA
[2] Sandia Natl Labs, Livermore, CA 94550 USA
Funding
National Science Foundation (USA)
Keywords
bioinformatics; unsupervised learning; phylogenic trees; genomic islands; high-performance computing;
DOI
10.1109/eIT60633.2024.10609858
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Unsupervised machine learning (ML) techniques are powerful tools for identifying similarity patterns and can be used to categorize data into related groups. This study showcases an application of unsupervised methods, namely hierarchical clustering, to determine related groups of newly identified genomic islands. Genomic islands (GIs) are mobile genetic elements integrated into bacterial chromosomes. GIs can affect bacterial evolution, for example by carrying virulence or metabolic genes [1]. Precisely identifying GIs calls for a sophisticated process, which we have implemented in a previously published tool, the Target / Integrative Genetic Element Retriever (TIGER). TIGER identifies mobile DNAs in each genome and identifies genomic islands with high accuracy [2]. We employed TIGER to identify approximately 130,000 GIs in E. coli. To identify similarities among the E. coli GIs, we applied hierarchical clustering algorithms together with our own heuristics, which categorized the data into relevant groups. We were able to identify related groups of GIs as well as singleton GIs. Our results provide a promising method for categorizing large DNA segments: segments that can be compared using a similarity measure can be grouped into more precise clusters for further analysis.
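
The grouping step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure, linkage method, cut threshold, or heuristics the authors used, so everything here is an assumption made for illustration: the toy distance matrix, average linkage, the 0.5 cut-off, and the helper names `cluster_islands` and `split_groups_and_singletons` are all hypothetical. The sketch only shows how a precomputed pairwise distance matrix might be cut into related groups and singletons with SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_islands(dist, threshold):
    """Cluster items from a symmetric pairwise distance matrix.

    Assumption: average linkage and a fixed distance cut-off; the
    abstract does not specify the method actually used.
    """
    condensed = squareform(dist, checks=True)  # square matrix -> condensed vector
    tree = linkage(condensed, method="average")
    # Cut the tree so that clusters merged above `threshold` stay separate.
    return fcluster(tree, t=threshold, criterion="distance")


def split_groups_and_singletons(labels):
    """Separate multi-member clusters from clusters with a single member."""
    members = {}
    for index, label in enumerate(labels):
        members.setdefault(int(label), []).append(index)
    related = [m for m in members.values() if len(m) > 1]
    singletons = [m[0] for m in members.values() if len(m) == 1]
    return related, singletons


# Toy data (hypothetical): items 0-2 are mutually close, items 3 and 4
# are far from everything, including each other.
dist = np.array([
    [0.00, 0.10, 0.20, 0.90, 0.90],
    [0.10, 0.00, 0.15, 0.90, 0.90],
    [0.20, 0.15, 0.00, 0.90, 0.90],
    [0.90, 0.90, 0.90, 0.00, 0.80],
    [0.90, 0.90, 0.90, 0.80, 0.00],
])

labels = cluster_islands(dist, threshold=0.5)
related, singletons = split_groups_and_singletons(labels)
print("related groups:", related)
print("singletons:", sorted(singletons))
```

With this toy matrix, the three close items end up in one group and the two distant items remain singletons. At the scale mentioned in the abstract (roughly 130,000 GIs), a full pairwise matrix would be very large, which is presumably one reason the record lists high-performance computing among its keywords.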
Pages: 504-507
Page count: 4