Voting-based consensus clustering for combining multiple clusterings of chemical structures

Cited by: 23
Authors
Saeed, Faisal [1, 2]
Salim, Naomie [1]
Abdo, Ammar [3, 4, 5]
Affiliations
[1] Univ Technol Malaysia, Fac Comp Sci & Informat Syst, Johor Baharu, Malaysia
[2] Sanhan Community Coll, Informat Technol Dept, Sanaa, Yemen
[3] Alhodaida Univ, Dept Comp Sci, Alhodaida, Yemen
[4] Univ Lille 1, LIFL UMR CNRS 8022, F-59655 Villeneuve d'Ascq, France
[5] INRIA Lille Nord Europe, F-59655 Villeneuve d'Ascq, France
Source
Journal of Cheminformatics, 4
Keywords
DATA FUSION; RECEPTOR-BINDING; SIMILARITY; CLASSIFICATIONS; COEFFICIENTS; DESCRIPTORS; COMBINATION; SELECTION;
DOI
10.1186/1758-2946-4-37
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Background: Although many consensus clustering methods have been successfully used for combining multiple classifiers in areas such as machine learning, applied statistics, pattern recognition and bioinformatics, few have been applied to combining multiple clusterings of chemical structures. No individual clustering method gives the best results for all types of applications. In this paper, therefore, three voting-based and graph-based consensus clustering methods were used to combine multiple clusterings of chemical structures, with the aim of improving the separation of biologically active molecules from inactive ones in each cluster.

Results: The cumulative voting-based aggregation algorithm (CVAA), the cluster-based similarity partitioning algorithm (CSPA) and the hyper-graph partitioning algorithm (HGPA) were examined. The F-measure and the Quality Partition Index (QPI) were used to evaluate the clusterings, and the results were compared with those of Ward's clustering method. The MDL Drug Data Report (MDDR) dataset was used for the experiments and was represented by two 2D fingerprints, ALOGP and ECFP_4. The voting-based consensus clustering method outperformed Ward's method on both the F-measure and QPI for both ALOGP and ECFP_4 fingerprints, while the graph-based consensus clustering methods outperformed Ward's method only for ALOGP using QPI. The Jaccard and Euclidean distance measures were the methods of choice for generating the ensembles, as they gave the highest values for both criteria.

Conclusions: The experimental results show that consensus clustering methods can improve the effectiveness of clustering chemical structures. CVAA was the method of choice among the consensus clustering methods.
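To make the idea of combining multiple clusterings concrete, the following Python sketch builds a co-association matrix, that is, the fraction of ensemble members in which two molecules fall in the same cluster. Pairwise similarity of this kind is what similarity-based methods such as CSPA start from. This is an illustration under stated assumptions and not the authors' implementation: the toy partitions are invented, and the final step (linking molecules whose co-association exceeds a hypothetical threshold of 0.5 and taking connected components) is a simple stand-in for the graph partitioner that CSPA would use.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    total = len(partitions)
    return [[c / total for c in row] for row in counts]


def consensus_labels(matrix, threshold=0.5):
    """Group objects linked by co-association above the threshold.

    Assumption: connected components of the thresholded graph are used
    here as a simplified replacement for a real graph partitioner.
    """
    n = len(matrix)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Toy ensemble (invented data): three clusterings of five molecules.
# Cluster label values are arbitrary and need not agree across partitions.
partitions = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
matrix = co_association(partitions)
consensus = consensus_labels(matrix)
print(consensus)
```

In this toy case molecules 0 and 1 are grouped together by every ensemble member, while molecules 2, 3 and 4 are chained together by majority agreement, so the sketch returns two consensus clusters.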
Pages: 8
Related papers
50 in total
  • [1] Voting-based consensus clustering for combining multiple clusterings of chemical structures
    Faisal Saeed
    Naomie Salim
    Ammar Abdo
    Journal of Cheminformatics, 4
  • [2] Information Theory and Voting Based Consensus Clustering for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    MOLECULAR INFORMATICS, 2013, 32 (07) : 591 - 598
  • [3] Adaptive Cumulative Voting-Based Aggregation Algorithm for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    Hentabli, Hamza
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2013), PT II, 2013, 7803 : 305 - 314
  • [4] Combining Multiple Clusterings of Chemical Structures Using Cumulative Voting-Based Aggregation Algorithm
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    Hentabli, Hamza
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2013), PT II, 2013, 7803 : 178 - 185
  • [5] Graph-Based Consensus Clustering for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    Hentabli, Hamza
    MOLECULAR INFORMATICS, 2013, 32 (02) : 165 - 178
  • [6] Using Soft Consensus Clustering for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    JURNAL TEKNOLOGI, 2013, 63 (01):
  • [7] Consensus Methods for Combining Multiple Clusterings of Chemical Structures
    Saeed, Faisal
    Salim, Naomie
    Abdo, Ammar
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2013, 53 (05) : 1026 - 1034
  • [8] Weighted voting-based consensus clustering for chemical structure databases
    Saeed, Faisal
    Ahmed, Ali
    Shamsir, Mohd Shahir
    Salim, Naomie
    JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2014, 28 (06) : 675 - 684
  • [9] Weighted voting-based consensus clustering for chemical structure databases
    Faisal Saeed
    Ali Ahmed
    Mohd Shahir Shamsir
    Naomie Salim
    Journal of Computer-Aided Molecular Design, 2014, 28 : 675 - 684
  • [10] Combining multiple classifications of chemical structures using consensus clustering
    Chu, Chia-Wei
    Holliday, John D.
    Willett, Peter
    BIOORGANIC & MEDICINAL CHEMISTRY, 2012, 20 (18) : 5366 - 5371