A modified Markov clustering approach to unsupervised classification of protein sequences

Cited by: 10
Authors
Szilagyi, Laszlo [1 ,2 ]
Medves, Lehel [1 ]
Szilagyi, Sandor M. [1 ]
Affiliations
[1] Sapientia Univ Transylvania, Fac Tech & Human Sci, Corunca 547367, Romania
[2] Budapest Univ Technol & Econ, Dept Control Engn & Informat Technol, H-1117 Budapest, Hungary
Keywords
Markov clustering; Bioinformatics; Protein sequence classification; Unsupervised classification; Sparse matrix; STRUCTURAL CLASSIFICATION; DATABASE; SCOP; ALGORITHM; EVOLUTION; SEARCH; MODELS
DOI
10.1016/j.neucom.2010.02.023
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we propose a modified Markov clustering algorithm for efficient and accurate clustering of large protein sequence databases, based on previously evaluated sequence similarity criteria. The proposed modification consists of an exponentially decreasing inflation rate, which aims to establish the hard structure of the clusters quickly by using a strong inflation in the beginning, and to produce fine partitions with a weaker inflation thereafter. The algorithm, which was tested and validated using the whole SCOP95 database or randomly selected 10-50% sections of it, generally converges within 12-14 iteration cycles and provides clusters of high quality. Furthermore, a novel generalized formula for the inflation operation is given, and an efficient matrix symmetrization technique is presented, in order to improve the partition quality with a relatively small amount of extra computation. Finally, an extra speedup is achieved by excluding isolated proteins from further processing. The proposed method performs better than previous solutions in terms of both partition quality and computational load. © 2010 Elsevier B.V. All rights reserved.
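The idea of an exponentially decreasing inflation rate can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the schedule, so the function `inflation_rate` and its parameters `r_start`, `r_end`, and `decay` are hypothetical, and the inflation step is the standard Markov clustering one (element-wise power followed by column normalization), not the generalized formula the paper introduces. The symmetrization step and the removal of isolated proteins are omitted.

```python
import math


def inflation_rate(step, r_start=2.0, r_end=1.4, decay=0.5):
    """Hypothetical schedule: decays exponentially from r_start toward r_end.

    The parameter values are illustrative assumptions, not taken from the paper.
    """
    return r_end + (r_start - r_end) * math.exp(-decay * step)


def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix (one extra random-walk step)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Standard inflation: element-wise power r, then column normalization."""
    return normalize_columns([[x ** r for x in row] for row in m])


def markov_cluster(similarity, max_steps=50, tol=1e-9):
    """Alternate expansion and inflation with a decreasing inflation rate."""
    n = len(similarity)
    m = normalize_columns(similarity)
    for step in range(max_steps):
        new = inflate(expand(m), inflation_rate(step))
        change = max(abs(a - b)
                     for row_new, row_old in zip(new, m)
                     for a, b in zip(row_new, row_old))
        m = new
        if change < tol:
            break
    # Assign each node (column) to the row holding its largest entry.
    clusters = {}
    for j in range(n):
        attractor = max(range(n), key=lambda i: m[i][j])
        clusters.setdefault(attractor, []).append(j)
    return sorted(clusters.values())
```

Calling `markov_cluster` on a small similarity matrix made of two disconnected groups returns one list of node indices per group. Dense nested lists are used here only for readability; for databases of the size discussed in the abstract, a sparse matrix representation would be the natural choice.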
Pages: 2332-2345
Page count: 14
Related papers
50 in total
  • [1] A Modified Markov Clustering Approach for Protein Sequence Clustering
    Medves, Lehel
    Szilagyi, Laszlo
    Szilagyi, Sandor M.
    PATTERN RECOGNITION IN BIOINFORMATICS, PROCEEDINGS, 2008, 5265 : 110 - 120
  • [2] New Clustering Approach for Protein Sequences
    Mhamdi, Faouzi
    Ouerfelli, Achref
    2015 26TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2015, : 43 - 47
  • [3] n-Gram-based classification and unsupervised hierarchical clustering of genome sequences
    Tomovic, A
    Janicic, P
    Keselj, V
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2006, 81 (02) : 137 - 153
  • [4] Unsupervised protein sequences clustering algorithm using functional domain information
    Chen, Wei-Bang
    Zhang, Chengcui
    Zhong, Hua
    PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2008, : 76 - 81
  • [5] Fuzzy clustering approach in unsupervised sea-ice classification
    Eom, KB
    NEUROCOMPUTING, 1999, 25 (1-3) : 149 - 166
  • [6] Hierarchical clustering approach for unsupervised image classification of hyperspectral data
    Lee, S
    Crawford, MM
    IGARSS 2004: IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM PROCEEDINGS, VOLS 1-7: SCIENCE FOR SOCIETY: EXPLORING AND MANAGING A CHANGING PLANET, 2004, : 941 - 944
  • [7] Classification of Photovoltaic Failures with Hidden Markov Modeling, an Unsupervised Statistical Approach
    Hopwood, Michael
    Patel, Lekha
    Gunda, Thushara
    ENERGIES, 2022, 15 (14)
  • [8] An unsupervised anomaly detection approach using subtractive clustering and Hidden Markov Model
    Yang, Chun
    Deng, Feiqi
    Yang, Haidong
    2007 SECOND INTERNATIONAL CONFERENCE IN COMMUNICATIONS AND NETWORKING IN CHINA, VOLS 1 AND 2, 2007, : 123 - 126
  • [9] Protein Sequences Clustering of Herpes Virus by Using Tribe Markov Clustering (Tribe-MCL)
    Bustamam, A.
    Siswantining, T.
    Febriyani, N. L.
    Novitasari, I. D.
    Cahyaningrum, R. D.
    INTERNATIONAL SYMPOSIUM ON CURRENT PROGRESS IN MATHEMATICS AND SCIENCES 2016 (ISCPMS 2016), 2017, 1862
  • [10] ParSymG: a parallel clustering approach for unsupervised classification of remotely sensed imagery
    Du, Zhenhong
    Gu, Yuhua
    Zhang, Chuanrong
    Zhang, Feng
    Liu, Renyi
    Sequeira, Jean
    Li, Weidong
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2017, 10 (05) : 471 - 489