chemmodlab: a cheminformatics modeling laboratory R package for fitting and assessing machine learning models

Authors
Ash, Jeremy R. [1]
Hughes-Oliver, Jacqueline M. [2]
Affiliations
[1] North Carolina State University, Bioinformatics Research Center, Department of Statistics, 335 Ricks Hall, Campus Box 7566, Raleigh, NC 27695, USA
[2] North Carolina State University, Department of Statistics, 2311 Stinson Dr, Campus Box 8203, Raleigh, NC 27695, USA
Source
Journal of Cheminformatics, 10
Keywords
Machine learning; QSAR; R package; Initial enhancement; Enrichment factor; Accumulation curve; Hit enrichment curve; Repeated cross-validation; Cross-validation; Selection bias; Error rate; Prediction; Property
DOI
10.1186/s13321-018-0309-4
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While it focuses on implementing model fitting and assessment methods that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures. The most novel feature of chemmodlab is the ease with which statistically significant performance differences among many machine learning models are presented by means of the multiple comparisons similarity plot. Differences are assessed using repeated k-fold cross-validation, where blocking increases precision and multiplicity adjustments are applied. chemmodlab is freely available on CRAN at https://cran.r-project.org/web/packages/chemmodlab/index.html.
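As a rough illustration of two assessment ideas named in the abstract and keywords, the accumulation curve and the enrichment factor, the following minimal Python sketch may help. It is not chemmodlab's API (the package itself is written in R). The function names `accumulation_curve` and `enrichment_factor`, the toy scores and labels, and the convention that ties keep their input order are all assumptions made for illustration.

```python
def accumulation_curve(scores, labels):
    """Cumulative number of actives found when compounds are tested
    in order of decreasing predicted score (hypothetical helper)."""
    # Rank compound indices from highest to lowest score; ties keep input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    curve = []
    for i in order:
        hits += labels[i]  # labels are 1 for active, 0 for inactive
        curve.append(hits)
    return curve


def enrichment_factor(scores, labels, n_tested):
    """Hit rate among the top n_tested compounds divided by the
    overall hit rate (hypothetical helper)."""
    curve = accumulation_curve(scores, labels)
    hit_rate_top = curve[n_tested - 1] / n_tested
    hit_rate_all = sum(labels) / len(labels)
    return hit_rate_top / hit_rate_all


# Toy data (invented for illustration): six compounds, two of them active.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0, 0]

curve = accumulation_curve(scores, labels)
ef = enrichment_factor(scores, labels, 3)
print(curve)
print(ef)
```

A curve that rises quickly means the model ranks actives early. An enrichment factor above 1 means the top-ranked compounds contain proportionally more actives than the full set.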
Pages: 20