Learning from positive examples when the negative class is undetermined - microRNA gene identification

Cited by: 41
Authors
Yousef, Malik [1, 3]
Jung, Segun [1, 2, 4]
Showe, Louise C. [1 ]
Showe, Michael K. [1 ]
Affiliations
[1] Wistar Inst Anat & Biol, Syst Biol Div, Philadelphia, PA 19104 USA
[2] Drexel Univ, Sch Biomed Engn, Sci & Hlth Syst, Philadelphia, PA 19104 USA
[3] Coll Sakhnin, Sakhnin, Israel
[4] NYU, Sch Med, Sackler Inst Grad Biomed Sci, New York, NY 10016 USA
DOI
10.1186/1748-7188-3-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: The application of machine learning to classification problems that depend only on positive examples is gaining attention in the computational biology community. We and others have described the use of two-class machine learning to identify novel miRNAs. These methods require the generation of an artificial negative class. However, designating the negative class can be problematic, and if it is not done properly it can dramatically affect the performance of the classifier and/or yield a biased estimate of performance. We present a study using one-class machine learning for microRNA (miRNA) discovery and compare one-class to two-class approaches using naive Bayes and Support Vector Machines. These results are compared to published two-class miRNA prediction approaches. We also examine the ability of the one-class and two-class techniques to identify miRNAs in newly sequenced species.
Results: Of all methods tested, we found that two-class naive Bayes and Support Vector Machines gave the best accuracy using our selected features and optimally chosen negative examples. One-class methods showed average accuracies of 70-80% versus 90% for the two two-class methods on the same feature sets. However, some one-class methods outperform some recently published two-class approaches with different selected features. Using the EBV genome as an external validation of the method, we found one-class machine learning to work as well as or better than a two-class approach in identifying true miRNAs as well as predicting new miRNAs.
Conclusion: One-class and two-class methods can both give useful classification accuracies when the negative class is well characterized. The advantage of one-class methods is that they eliminate guessing at the optimal features for the negative class when they are not well defined. In these cases one-class methods can be superior to two-class methods when the features chosen as representative of the positive class are well defined.
Availability: The OneClassmiRNA program is available at: [1].
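The contrast the abstract draws between one-class and two-class learning can be illustrated with a minimal sketch. It is not the OneClassmiRNA program and makes several assumptions that do not come from the paper: the features are synthetic two-dimensional points, each feature is modeled as an independent Gaussian (a naive Bayes style assumption), the two classes get equal priors, and the names `fit_one_class`, `fit_two_class`, and `reject_fraction` are hypothetical. The one-class model sees only positive examples and rejects anything whose likelihood falls below a cutoff learned from those positives; the two-class model also needs an explicit negative sample.

```python
import math
import random


def fit_gaussians(rows):
    """Per-feature mean and variance, treating features as independent."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    variances = [
        max(sum((r[j] - means[j]) ** 2 for r in rows) / n, 1e-9)
        for j in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Sum of per-feature Gaussian log densities."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        for xj, m, v in zip(x, means, variances)
    )


def fit_one_class(positives, reject_fraction=0.1):
    """Fit on positives only. The cutoff is set so that roughly
    reject_fraction of the training positives score below it
    (hypothetical parameter, chosen for illustration)."""
    model = fit_gaussians(positives)
    scores = sorted(log_likelihood(x, model) for x in positives)
    threshold = scores[int(reject_fraction * len(scores))]
    return model, threshold


def predict_one_class(x, fitted):
    model, threshold = fitted
    return log_likelihood(x, model) >= threshold


def fit_two_class(positives, negatives):
    """Needs an explicit negative sample, unlike the one-class model."""
    return fit_gaussians(positives), fit_gaussians(negatives)


def predict_two_class(x, fitted):
    # Equal class priors are assumed, so only the likelihoods are compared.
    pos_model, neg_model = fitted
    return log_likelihood(x, pos_model) > log_likelihood(x, neg_model)


# Synthetic data (an assumption for illustration, not miRNA features).
random.seed(0)


def sample(center, n):
    return [[random.gauss(c, 1.0) for c in center] for _ in range(n)]


train_pos, train_neg = sample((0.0, 0.0), 200), sample((5.0, 5.0), 200)
test_pos, test_neg = sample((0.0, 0.0), 100), sample((5.0, 5.0), 100)

one_class = fit_one_class(train_pos)
two_class = fit_two_class(train_pos, train_neg)


def accuracy(predict, fitted):
    hits = sum(predict(x, fitted) for x in test_pos)
    hits += sum(not predict(x, fitted) for x in test_neg)
    return hits / (len(test_pos) + len(test_neg))


print("one-class accuracy:", accuracy(predict_one_class, one_class))
print("two-class accuracy:", accuracy(predict_two_class, two_class))
```

On well-separated synthetic data like this, both models should do well. The sketch only shows the structural difference: `fit_one_class` never touches `train_neg`, so its behavior does not depend on how the negative class was chosen.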
Pages: 9
Related papers
50 in total
  • [1] Learning from positive examples when the negative class is undetermined- microRNA gene identification
    Malik Yousef
    Segun Jung
    Louise C Showe
    Michael K Showe
    Algorithms for Molecular Biology, 3
  • [2] Learning DMEs from Positive and Negative Examples
    Li, Yeting
    Dong, Chunmei
    Chu, Xinyu
    Chen, Haiming
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 434 - 438
  • [3] Learning when negative examples abound
    Kubat, M
    Holte, R
    Matwin, S
    MACHINE LEARNING : ECML-97, 1997, 1224 : 146 - 153
  • [4] Learning Description Logic Concepts: When can Positive and Negative Examples be Separated?
    Funk, Maurice
    Jung, Jean Christoph
    Lutz, Carsten
    Pulcini, Hadrien
    Wolter, Frank
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 1682 - 1688
  • [5] Learning to Classify Neutral Examples from Positive and Negative Opinions
    Martin-Valdivia, Maria-Teresa
    Montejo-Raez, Arturo
    Urena-Lopez, Alfonso
    Rushdi Saleh, Mohammed
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2012, 18 (16) : 2319 - 2333
  • [6] Learning of simple conceptual graphs from positive and negative examples
    Kuznetsov, SO
    PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY, 1999, 1704 : 384 - 391
  • [7] Learning from positive data and negative counter examples: A survey
    Kinber, Efim, 1600, Springer Verlag (8808):
  • [8] Learning from Positive and Negative Examples: Dichotomies and Parameterized Algorithms
    Lingg, Jonas
    Oliveira, Mateus de Oliveira
    Wolf, Petra
    COMBINATORIAL ALGORITHMS (IWOCA 2022), 2022, 13270 : 398 - 411
  • [9] Complexity of learning in concept lattices from positive and negative examples
    Kuznetsov, SO
    DISCRETE APPLIED MATHEMATICS, 2004, 142 (1-3) : 111 - 125
  • [10] Learning from positive and negative examples: New proof for binary alphabets
    Lingg, Jonas
    de Oliveira Oliveira, Mateus
    Wolf, Petra
    Information Processing Letters, 2024, 183