Introducing the consensus modeling concept in genetic algorithms: Application to interpretable discriminant analysis

Cited by: 17
Authors
Ganguly, Milan
Brown, Nathan [1]
Schuffenhauer, Ansgar
Ertl, Peter
Gillet, Valerie J.
Greenidge, Paulette A.
Affiliations
[1] Novartis Inst BioMed Res, CH-4002 Basel, Switzerland
[2] Univ Sheffield, Krebs Inst Biomol Res, Sheffield S10 2TN, S Yorkshire, England
[3] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
DOI
10.1021/ci050529l
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
An evolutionary statistical learning method was applied to classify drugs according to their biological target and also to discriminate between a compilation of oral and nonoral drugs. The emphasis was placed not only on how well the models predict but also on their interpretability. In an enhancement to previous studies, the consistency of the model weights over several runs of the genetic algorithm was considered with the goal of producing comprehensible models. Via this approach, the descriptors and their ranges that contribute most to class discrimination were identified. Selecting a bin step size that enables the average descriptor properties of the class being trained to be captured improves the interpretability and discriminatory power of a model. The performance, consistency, and robustness of such models were further enhanced by using two novel approaches that reduce the variability between individual solutions: consensus and splice modeling. Finally, the ability of the genetic algorithm to discriminate between activity classes was compared with a similarity searching method, while naive Bayes classifiers and support vector machines were applied in discriminating the oral and nonoral drugs.
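The consensus idea mentioned in the abstract can be pictured with a small sketch. This is not the authors' implementation: it assumes that each genetic algorithm run yields one weight per descriptor bin and that a consensus model is formed by simply averaging those weights across runs, with the per-bin spread serving as a rough consistency measure. The function name `consensus_weights` and the toy numbers are hypothetical.

```python
from statistics import mean, pstdev


def consensus_weights(runs):
    """Combine per-bin weights from several runs (illustrative assumption).

    runs: list of weight lists, one list per run, all of equal length.
    Returns (consensus, spread): the per-bin mean weight and the per-bin
    population standard deviation across runs.
    """
    n_bins = len(runs[0])
    if any(len(r) != n_bins for r in runs):
        raise ValueError("every run must provide one weight per bin")
    columns = list(zip(*runs))  # regroup the weights bin by bin
    consensus = [mean(col) for col in columns]
    spread = [pstdev(col) for col in columns]
    return consensus, spread


# Toy data (hypothetical): three runs, three descriptor bins.
runs = [
    [1.0, 0.0, -1.0],
    [0.5, 0.5, -1.0],
    [1.5, -0.5, -1.0],
]
consensus, spread = consensus_weights(runs)
```

In this toy case the third bin has zero spread, so all runs agree on its weight, while the first two bins vary between runs. Under the stated assumption, bins with low spread are the ones whose weights would be easiest to interpret.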
Pages: 2110-2124
Number of pages: 15
Related papers
50 in total
  • [21] Modeling and analysis of genetic algorithms using neural networks
    Imai, J
    Yoshikawa, T
    Shioya, H
    Da-te, T
    COMPUTING ANTICIPATORY SYSTEMS, 2002, 627 : 365 - 372
  • [22] Application of canonical discriminant analysis for the assessment of genetic variation in tall fescue
    Vaylay, R
    van Santen, E
    CROP SCIENCE, 2002, 42 (02) : 534 - 539
  • [23] Application of genetic algorithms and Kohonen networks to cluster analysis
    Gorzalczany, MB
    Rudzinski, F
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING - ICAISC 2004, 2004, 3070 : 556 - 561
  • [24] Learning in economics - Analysis and application of genetic algorithms.
    Dawid, H
    JOURNAL OF ECONOMICS-ZEITSCHRIFT FUR NATIONALOKONOMIE, 2002, 77 (03) : 283 - 287
  • [25] Metabolic profiling using principal component analysis, discriminant partial least squares, and genetic algorithms
    Ramadan, Z
    Jacobs, D
    Grigorov, M
    Kochhar, S
    TALANTA, 2006, 68 (05) : 1683 - 1691
  • [26] Mathematical modeling analysis of genetic algorithms under schema theorem
    Liu, Donghai
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2019, 19 (S1) : S131 - S137
  • [27] Concept for the Application of Genetic Algorithms in the Management of Transport Offers in Relation to Homogenous Cargo Transport
    Wozniak, Waldemar
    Stryjski, Roman
    Mielniczuk, Janusz
    Wojnarowski, Tomasz
    INNOVATION MANAGEMENT AND SUSTAINABLE ECONOMIC COMPETITIVE ADVANTAGE: FROM REGIONAL DEVELOPMENT TO GLOBAL GROWTH, VOLS I - VI, 2015, 2015, : 2329 - 2339
  • [28] Application of Genetic Algorithms for Optimization of Salesman's Tasks and Their Modeling by Sequential Selection
    Boyko, Nataliya
    Pytel, Andriy
    COLINS 2021: COMPUTATIONAL LINGUISTICS AND INTELLIGENT SYSTEMS, VOL I, 2021, 2870
  • [29] Modeling and Analysis of Partial Power Concept for Data Center Application
    Wu, Di
    Wang, Pinhe
    Lyu, Yanda
    Arruti, Asier Romero
    Ouyang, Ziwei
    Andersen, Michael Andreas Esbern
    2024 IEEE APPLIED POWER ELECTRONICS CONFERENCE AND EXPOSITION, APEC, 2024, : 2644 - 2650
  • [30] The application of formal concept analysis for modeling hospital clinical processes
    Pan, Telung
    Fang, Kwoting
    WSEAS: ADVANCES ON APPLIED COMPUTER AND APPLIED COMPUTATIONAL SCIENCE, 2008, : 417 - 422