Combining Supervised and Unsupervised Machine Learning Methods for Phenotypic Functional Genomics Screening

Cited by: 9
Authors
Omta, Wienand A. [1 ,2 ,3 ]
van Heesbeen, Roy G. [4 ]
Shen, Ian [2 ]
de Nobel, Jacob [1 ]
Robers, Desmond [1 ]
van Der Velden, Lieke M. [1 ]
Medema, Rene H. [4 ]
Siebes, Arno P. J. M. [2 ]
Feelders, Ad J. [2 ]
Brinkkemper, Sjaak [2 ]
Klumperman, Judith S. [1 ]
Spruit, Marco Rene [2 ]
Brinkhuis, Matthieu J. S. [2 ]
Egan, David A. [3 ]
Affiliations
[1] UMC Utrecht, Dept Cell Biol, Ctr Mol Med, Utrecht, Netherlands
[2] Univ Utrecht, Dept Informat & Comp Sci, Utrecht, Netherlands
[3] Core Life Analyt BV, Padualaan 8, NL-3584 CH Utrecht, Netherlands
[4] NKI AVL, Dept Cell Biol, Amsterdam, Noord Holland, Netherlands
Keywords
artificial intelligence; supervised machine learning; classification; phenotypic profiles
DOI
10.1177/2472555220919345
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
There has been an increase in the use of machine learning and artificial intelligence (AI) for the analysis of image-based cellular screens. The accuracy of these analyses, however, is greatly dependent on the quality of the training sets used for building the machine learning models. We propose that unsupervised exploratory methods should first be applied to the data set to gain a better insight into the quality of the data. This improves the selection and labeling of data for creating training sets before the application of machine learning. We demonstrate this using a high-content genome-wide small interfering RNA screen. We perform an unsupervised exploratory data analysis to facilitate the identification of four robust phenotypes, which we subsequently use as a training set for building a high-quality random forest machine learning model to differentiate four phenotypes with an accuracy of 91.1% and a kappa of 0.85. Our approach enhanced our ability to extract new knowledge from the screen when compared with the use of unsupervised methods alone.
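The abstract reports model quality as an accuracy together with a kappa. As a minimal sketch of how these two figures relate, the following assumes that "kappa" means Cohen's kappa computed from a confusion matrix, which the record does not state explicitly. The function name `accuracy_and_kappa` and the 4-class matrix are hypothetical and purely illustrative; the numbers are not taken from the paper.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: fraction of samples on the diagonal (the accuracy).
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical 4-phenotype confusion matrix (illustrative only, not paper data).
example = [
    [45, 2, 2, 1],
    [3, 44, 2, 1],
    [1, 2, 46, 1],
    [2, 1, 1, 46],
]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

Kappa discounts the agreement that a classifier would reach by chance given the class frequencies, which is why it is lower than the raw accuracy for the same predictions.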
Pages: 655-664
Page count: 10