Comprehensive ensemble in QSAR prediction for drug discovery

Cited by: 104
Authors
Kwon, Sunyoung [1 ,3 ]
Bae, Ho [2 ]
Jo, Jeonghee [2 ]
Yoon, Sungroh [1 ,2 ,4 ,5 ,6 ,7 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Bioinformat, Seoul 08826, South Korea
[3] NAVER Corp, Clova AI Res, Seongnam 13561, South Korea
[4] Seoul Natl Univ, Biol Sci, Seoul 08826, South Korea
[5] Seoul Natl Univ, ASRI, Seoul 08826, South Korea
[6] Seoul Natl Univ, INMC, Seoul 08826, South Korea
[7] Seoul Natl Univ, Inst Engn Res, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore
Keywords
Ensemble-learning; Meta-learning; Drug-prediction; REGRESSION;
DOI
10.1186/s12859-019-3135-4
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Quantitative structure-activity relationship (QSAR) modeling is a computational method for revealing relationships between the structural properties of chemical compounds and their biological activities. QSAR modeling is essential for drug discovery, but it has many constraints. Ensemble-based machine learning approaches have been used to overcome these constraints and obtain reliable predictions. Ensemble learning builds a set of diversified models and combines them. However, the most prevalent approach, random forest, and other ensemble approaches in QSAR prediction limit their model diversity to a single subject.
Results: The proposed ensemble method consistently outperformed thirteen individual models on 19 bioassay datasets and demonstrated superiority over other ensemble approaches that are limited to a single subject. The comprehensive ensemble method is publicly available at .
Conclusions: We propose a comprehensive ensemble method that builds multi-subject diversified models and combines them through second-level meta-learning. In addition, we propose an end-to-end neural network-based individual classifier that can automatically extract sequential features from a simplified molecular-input line-entry system (SMILES) string. The proposed individual model did not show impressive results as a single model, but it was considered the most important predictor when combined, according to the interpretation of the meta-learning.
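The idea of combining diversified first-level models through second-level meta-learning can be illustrated with a minimal stacking sketch. This is not the authors' implementation: the toy data, the three feature "views" standing in for diverse individual models, the use of logistic regression at both levels, the simple hold-out split for training the meta-learner, and every function name below are assumptions made only for illustration.

```python
import math
import random

# Hypothetical feature "views": each first-level model sees one subset of inputs.
VIEWS = [[0], [1], [2]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def make_data(n, rng):
    """Toy data (assumption): label depends on features 0 and 1; feature 2 is noise."""
    X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [1 if x[0] + x[1] > 0 else 0 for x in X]
    return X, y


def first_level(base_models, X):
    """Turn raw inputs into meta-features: one predicted probability per base model."""
    return [
        [predict_proba(m, [x[i] for i in idx]) for m, idx in zip(base_models, VIEWS)]
        for x in X
    ]


def fit_stack(X, y):
    """Train base models on one half, then the meta-learner on the other half."""
    half = len(X) // 2
    X_base, y_base = X[:half], y[:half]
    X_meta, y_meta = X[half:], y[half:]
    base_models = [
        fit_logistic([[x[i] for i in idx] for x in X_base], y_base) for idx in VIEWS
    ]
    meta_model = fit_logistic(first_level(base_models, X_meta), y_meta)
    return base_models, meta_model


def predict_stack(base_models, meta_model, X):
    return [predict_proba(meta_model, f) for f in first_level(base_models, X)]


def demo(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_data(200, rng)
    X_test, y_test = make_data(100, rng)
    base_models, meta_model = fit_stack(X_train, y_train)
    probs = predict_stack(base_models, meta_model, X_test)
    hits = sum((p > 0.5) == bool(t) for p, t in zip(probs, y_test))
    return hits / len(y_test), meta_model, probs


if __name__ == "__main__":
    accuracy, meta_model, _ = demo()
    print("stacked accuracy:", accuracy)
    print("meta-learner weights per base model:", meta_model[0])
```

In this sketch the meta-learner's weights give a rough indication of how much each first-level model contributes to the combined prediction, which is the kind of interpretation the abstract refers to. Training the meta-learner on predictions for data the base models have not seen avoids rewarding base models that merely memorize their training set.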
Pages: 12