Dimensionality Reduction for Cluster Identification in Metagenomics using Autoencoders

Cited by: 2
Authors
Maduranga, Uditha [1]
Wijegunarathna, Kalana [1]
Weerasinghe, Sadeep [1]
Perera, Indika [1]
Wickramarachchi, Anuradha [2]
Affiliations
[1] Univ Moratuwa, Dept Comp Sci & Engn, Katubedda, Sri Lanka
[2] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT, Australia
Keywords
metagenomics; metagenomic data visualizations; nonlinear dimensionality reduction; autoencoders; clustering;
DOI
10.1109/ICTer51097.2020.9325447
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Metagenomics is the study of the genomic content of microbial organisms sampled directly from their natural habitats. These collections of unknown genomic data are analyzed without prior lab-based cultivation, which avoids amplification bias. A vital aspect of metagenomic analysis is visualizing the information derived from the genomic sequences of a microbiome sample. In a successful visualization, congruent reads appear in clusters that reflect the diversity and taxonomy of the microorganisms in the sequenced sample. When converting high-dimensional sequence data into low-dimensional data for visualization, preserving the genomic characteristics is the highest priority, so precise and efficient dimensionality reduction methods are crucial. Currently, Principal Component Analysis (PCA), a linear technique, and t-distributed Stochastic Neighbor Embedding (t-SNE), a non-linear technique, are used for dimensionality reduction in metagenomics. Although these techniques are widely used, the visualizations they produce have shortcomings in accuracy and efficiency. In this paper, we explore the possibility of using autoencoders, a deep learning technique, to obtain a rich dimensionality reduction that overcomes the limitations of PCA and t-SNE and outperforms them to achieve better metagenomic visualizations.
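To make the idea of an autoencoder as a dimensionality reducer concrete, the following is a minimal sketch and not the architecture used in the paper. It assumes reads are featurized as dinucleotide (2-mer) frequency vectors, which the abstract does not specify, and it uses synthetic AT-rich and GC-rich reads in place of real data. The names `kmer_profile`, `TinyAutoencoder`, and `random_read`, as well as the layer sizes and learning rate, are hypothetical choices made for illustration.

```python
import math
import random
from itertools import product


def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector over ACGT (an assumed featurization)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers with ambiguous bases such as N
            counts[j] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


class TinyAutoencoder:
    """One tanh bottleneck layer and a linear decoder, trained by plain SGD."""

    def __init__(self, n_in, n_hidden=2, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def encode(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def decode(self, h):
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(self.w2, self.b2)]

    def loss(self, data):
        """Mean squared reconstruction error per sample (summed over features)."""
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
        return total / len(data)

    def train_step(self, x, lr):
        h = self.encode(x)
        y = self.decode(h)
        err = [yi - xi for yi, xi in zip(y, x)]  # gradient of 0.5 * ||y - x||^2
        # Backpropagate through the decoder and the tanh nonlinearity.
        dh = [sum(err[i] * self.w2[i][j] for i in range(len(err))) * (1 - h[j] ** 2)
              for j in range(len(h))]
        for i, e in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[i][j] -= lr * e * hj
            self.b2[i] -= lr * e
        for j, d in enumerate(dh):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * d * xi
            self.b1[j] -= lr * d


def random_read(rng, weights, length=200):
    """Synthetic read with the given A/C/G/T sampling weights (toy data only)."""
    return "".join(rng.choices("ACGT", weights=weights, k=length))


rng = random.Random(42)
reads = ([random_read(rng, [4, 1, 1, 4]) for _ in range(20)] +   # AT-rich "genome"
         [random_read(rng, [1, 4, 4, 1]) for _ in range(20)])    # GC-rich "genome"
data = [kmer_profile(r) for r in reads]

model = TinyAutoencoder(n_in=len(data[0]))
loss_before = model.loss(data)
for _ in range(200):
    for x in rng.sample(data, len(data)):  # visit samples in random order
        model.train_step(x, lr=0.1)
loss_after = model.loss(data)

# Each read is now a 2-D point that could be drawn on a scatter plot.
embedding = [model.encode(x) for x in data]
print(f"reconstruction loss: {loss_before:.4f} -> {loss_after:.4f}")
```

The two-unit bottleneck plays the role that the first two principal components play in PCA, except that the tanh activation allows a non-linear mapping. A realistic setup would likely use longer k-mers, deeper encoders, and an established deep learning library.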
Pages: 113-118
Page count: 6