Feasibility of Active Machine Learning for Multiclass Compound Classification

Cited by: 30
Authors
Lang, Tobias [1 ,2 ]
Flachsenberg, Florian [1 ]
von Luxburg, Ulrike [3 ]
Rarey, Matthias [1 ]
Affiliations
[1] Univ Hamburg, Ctr Bioinformat, D-20146 Hamburg, Germany
[2] Univ Hamburg, Dept Comp Sci, Schluterstr 70, D-20146 Hamburg, Germany
[3] Univ Tubingen, Dept Comp Sci, D-72076 Tubingen, Germany
Keywords
DISCOVERY; TOOL;
DOI
10.1021/acs.jcim.5b00332
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
A common task in the hit-to-lead process is classifying sets of compounds into multiple, usually structural, classes, which lay the groundwork for subsequent SAR studies. Machine learning techniques can automate this process by learning classification models from training compounds of each class. Gathering class information for compounds can be cost-intensive because the required data must be provided by human experts or experiments. This paper studies whether active machine learning can reduce the required number of training compounds. Active learning is a machine learning method that processes class label data iteratively and has gained much attention in a broad range of application areas. In this paper, an active learning method for multiclass compound classification is proposed. The method selects informative training compounds so as to optimally support the learning progress. Combined with human feedback, it yields a semiautomated, interactive multiclass classification procedure. The method was investigated empirically on 15 compound classification tasks containing 86-2870 compounds in 3-38 classes. The empirical results show that active learning can solve these classification tasks using 10-80% of the data that would be necessary for standard learning techniques.
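The iterative selection of informative training compounds described in the abstract can be illustrated with a small pool-based active learning loop. The sketch below is not the authors' method: the abstract does not specify the classifier or the selection criterion, so a nearest-centroid classifier and margin-based uncertainty sampling are assumed here purely for illustration. All names (`centroids`, `predict`, `margin`, `active_learn`, `oracle`) are hypothetical, the `oracle` callback stands in for the human expert or experiment that provides class labels, and the 2D points are synthetic placeholders for real compound descriptors.

```python
import math
import random


def centroids(points, labels):
    """Return the mean feature vector of each class among the labelled points."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(point, cents):
    """Assign the class whose centroid is nearest to the point."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))


def margin(point, cents):
    """Distance gap between the two nearest centroids; a small gap means high uncertainty."""
    dists = sorted(math.dist(point, c) for c in cents.values())
    return dists[1] - dists[0] if len(dists) > 1 else 0.0


def active_learn(pool, oracle, seed_indices, budget):
    """Query up to `budget` labels, always asking about the most uncertain compound."""
    labelled = {i: oracle(i) for i in seed_indices}
    for _ in range(budget):
        cents = centroids([pool[i] for i in labelled], list(labelled.values()))
        unlabelled = [i for i in range(len(pool)) if i not in labelled]
        if not unlabelled:
            break
        query = min(unlabelled, key=lambda i: margin(pool[i], cents))
        labelled[query] = oracle(query)  # the expert supplies the class label
    cents = centroids([pool[i] for i in labelled], list(labelled.values()))
    return labelled, cents


# Synthetic stand-in for compound descriptors: three well-separated 2D clusters.
rng = random.Random(0)
centers = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
pool, truth = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        pool.append([rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)])
        truth.append(label)

labelled, cents = active_learn(pool, lambda i: truth[i], seed_indices=[0, 30, 60], budget=6)
accuracy = sum(predict(p, cents) == y for p, y in zip(pool, truth)) / len(pool)
print(f"labels used: {len(labelled)} of {len(pool)}, pool accuracy: {accuracy:.2f}")
```

In this toy setting the loop starts from one labelled example per class and requests six further labels. On real compound data, the descriptors, classifier, seed set, and stopping criterion would all need to be chosen for the task at hand.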
Pages: 12-20
Page count: 9