Feasibility of Active Machine Learning for Multiclass Compound Classification

Cited by: 30
Authors
Lang, Tobias [1, 2]
Flachsenberg, Florian [1]
von Luxburg, Ulrike [3]
Rarey, Matthias [1]
Affiliations
[1] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
[2] Univ Hamburg, Dept Comp Sci, Schluterstr 70, D-20146 Hamburg, Germany
[3] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Germany
Keywords
DISCOVERY; TOOL;
DOI
10.1021/acs.jcim.5b00332
Chinese Library Classification number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural classes, which build the groundwork for subsequent SAR studies. Machine learning techniques can be used to automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive as the required data needs to be provided by human experts or experiments. This paper studies whether active machine learning can be used to reduce the required number of training compounds. Active learning is a machine learning method which processes class label data in an iterative fashion. It has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. This method selects informative training compounds so as to optimally support the learning progress. The combination with human feedback leads to a semiautomated interactive multiclass classification procedure. This method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data which would be necessary for standard learning techniques.
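The iterative procedure described in the abstract can be illustrated with a small pool-based active learning loop. This is only a sketch under stated assumptions, not the authors' method: the nearest-centroid classifier, the margin-based query rule, the 2D toy points standing in for compound descriptors, and all names (`active_learn`, `oracle`, `demo`) are hypothetical choices made for illustration. The `oracle` callable plays the role of the human expert or experiment that supplies a class label on request.

```python
import math
import random


def class_centroids(points, labels):
    """Mean feature vector per class, computed from the labeled examples."""
    sums, counts = {}, {}
    for x, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: tuple(v / counts[c] for v in s) for c, s in sums.items()}


def predict(x, cents):
    """Assign x to the class whose centroid is nearest."""
    return min(cents, key=lambda c: math.dist(x, cents[c]))


def margin(x, cents):
    """Gap between the two smallest centroid distances (small = uncertain)."""
    d = sorted(math.dist(x, c) for c in cents.values())
    return d[1] - d[0] if len(d) > 1 else 0.0


def active_learn(pool, oracle, seed_indices, budget):
    """Query labels one at a time, always for the most uncertain pool item."""
    labeled = {i: oracle(i) for i in seed_indices}
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        cents = class_centroids([pool[i] for i in labeled], list(labeled.values()))
        query = min(unlabeled, key=lambda i: margin(pool[i], cents))
        labeled[query] = oracle(query)  # the "expert" provides the class label
    cents = class_centroids([pool[i] for i in labeled], list(labeled.values()))
    return cents, labeled


def demo(budget=10):
    """Toy data: three well-separated 2D clusters (assumed, not from the paper)."""
    rng = random.Random(0)
    centers = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
    pool, truth = [], []
    for name, (cx, cy) in centers.items():
        for _ in range(30):
            pool.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
            truth.append(name)
    seeds = [0, 30, 60]  # one labeled example per class to start
    cents, labeled = active_learn(pool, lambda i: truth[i], seeds, budget)
    hits = sum(predict(x, cents) == t for x, t in zip(pool, truth))
    return hits / len(pool), len(labeled)
```

Calling `demo()` returns the accuracy over the whole pool and the number of labels requested. In a real setting the toy points would be replaced by compound descriptors or fingerprints and the classifier by whatever model fits the task; the loop structure (train, pick the most informative unlabeled item, ask for its label, repeat) is the part that carries over.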
Pages: 12-20
Number of pages: 9