Hierarchical classification of microorganisms based on high-dimensional phenotypic data

Cited by: 19
Authors
Tafintseva, Valeria [1 ]
Vigneau, Evelyne [2 ]
Shapaval, Volha [1 ]
Cariou, Veronique [2 ]
Qannari, El Mostafa [2 ]
Kohler, Achim [1 ]
Affiliations
[1] Norwegian Univ Life Sci, Fac Sci & Technol, N-1432 As, Norway
[2] INRA, Oniris, StatSC, Nantes, France
Keywords
classification analysis; FTIR spectroscopy of microorganisms; hierarchical tree structure; TRANSFORM INFRARED-SPECTROSCOPY; PARTIAL LEAST-SQUARES; FT-IR SPECTROSCOPY; RAPID IDENTIFICATION; VARIABLE SELECTION; BACTERIA; DIFFERENTIATION; SPARSE; TOOL;
DOI
10.1002/jbio.201700047
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
The classification of microorganisms by high-dimensional phenotyping methods such as FTIR spectroscopy is often complicated by the complexity of microbial phylogenetic taxonomy. A hierarchical structure developed for such data can often facilitate the classification analysis. The hierarchical tree structure can either be imposed on a given set of phenotypic data by integrating the phylogenetic taxonomic structure, or be set up by revealing the inherent clusters in the phenotypic data. In this study, we compare different approaches to hierarchical classification of microorganisms based on high-dimensional phenotypic data. A set of 19 different species of molds (filamentous fungi) obtained from the mycological strain collection of the Norwegian Veterinary Institute (Oslo, Norway) is used for the study. Hierarchical cluster analysis is performed to set up the classification trees. Classification algorithms such as artificial neural networks (ANN), partial least squares discriminant analysis (PLS-DA) and random forest (RF) are used and compared. ANN and RF outperformed all the other approaches even though they did not utilize a predefined hierarchical structure. To our knowledge, the RF approach is used here for the first time to classify microorganisms by FTIR spectroscopy.
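The idea of setting up a tree from the inherent clusters in the data can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not state the linkage, the distance measure, or any preprocessing, so the average linkage, the Euclidean distance, the class-centroid summary, the function names and the toy data below are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def class_centroids(spectra, labels):
    """Average the feature vectors of each class (e.g. each species)."""
    grouped = {}
    for vector, label in zip(spectra, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def build_tree(centroids):
    """Agglomerative clustering of class centroids (assumed: average linkage).

    Returns a nested tuple: leaves are class labels, internal nodes are
    (subtree, subtree) pairs.
    """
    # Each cluster is stored as (subtree, list of member class labels).
    clusters = [(label, [label]) for label in sorted(centroids)]

    def linkage(a, b):
        # Mean Euclidean distance over all centroid pairs of the two clusters.
        return fmean(dist(centroids[x], centroids[y]) for x in a[1] for y in b[1])

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]


# Hypothetical toy data: two features per "spectrum", three classes.
spectra = [
    [0.0, 0.1], [0.1, 0.0],  # class "A"
    [0.2, 0.1], [0.3, 0.2],  # class "B"
    [5.0, 5.1], [5.2, 4.9],  # class "C"
]
labels = ["A", "A", "B", "B", "C", "C"]

tree = build_tree(class_centroids(spectra, labels))
print(tree)
```

In a hierarchical classification scheme, each internal node of such a tree would typically get its own classifier that routes a sample to one of its subtrees, whereas a flat classifier such as RF or ANN predicts the class directly without the tree.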
Pages: 10
Related papers
50 in total
  • [1] A Hierarchical Gamma Mixture Model-Based Method for Classification of High-Dimensional Data
    Azhar, Muhammad
    Li, Mark Junjie
    Huang, Joshua Zhexue
    [J]. ENTROPY, 2019, 21 (09)
  • [2] A classification algorithm for high-dimensional data
    Roy, Asim
    [J]. INNS CONFERENCE ON BIG DATA 2015 PROGRAM, 2015, 53 : 345 - 355
  • [3] Classification methods for high-dimensional genetic data
    Kalina, Jan
    [J]. BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2014, 34 (01) : 10 - 18
  • [4] Enhanced algorithm for high-dimensional data classification
    Wang, Xiaoming
    Wang, Shitong
    [J]. APPLIED SOFT COMPUTING, 2016, 40 : 1 - 9
  • [5] Online Nonlinear Classification for High-Dimensional Data
    Vanli, N. Denizcan
    Ozkan, Huseyin
    Delibalta, Ibrahim
    Kozat, Suleyman S.
    [J]. 2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, : 685 - 688
  • [6] A training algorithm for classification of high-dimensional data
    Vieira, A
    Barradas, N
    [J]. NEUROCOMPUTING, 2003, 50 : 461 - 472
  • [7] A Compressive Classification Framework for High-Dimensional Data
    Tabassum, Muhammad Naveed
    Ollila, Esa
    [J]. IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2020, 1 : 177 - 186
  • [8] Ensemble Method for Classification of High-Dimensional Data
    Piao, Yongjun
    Park, Hyun Woo
    Jin, Cheng Hao
    Ryu, Keun Ho
    [J]. 2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 245 - +
  • [9] Maximally Informative Hierarchical Representations of High-Dimensional Data
    Ver Steeg, Greg
    Galstyan, Aram
    [J]. ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 38, 2015, 38 : 1004 - 1012
  • [10] High-Dimensional Data Classification Based on Smooth Support Vector Machines
    Purnami, Santi Wulan
    Andari, Shofi
    Pertiwi, Yuniati Dian
    [J]. THIRD INFORMATION SYSTEMS INTERNATIONAL CONFERENCE 2015, 2015, 72 : 477 - 484