Selecting training instances for supervised classification

Cited by: 0
Authors
Roiger, R
Cornell, L
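Affiliations
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code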
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Several experimental studies have tested the relative merits of various supervised machine learning models. Comparisons have been made along dimensions that include model complexity, prediction accuracy, training set size, and training time. Only limited work has been done to study the effect of training set exemplar typicality on model performance. We present experimental results obtained in testing C4.5, SX-WEB, a backpropagation neural network, and linear discriminant analysis using a real-valued and a mixed form of a medical data set. We generated training sets of highly typical, widely varied, and atypical exemplars for both data sets. We tested the classification accuracy of each model using the generated training sets. Test set accuracy levels ranged between 76% and 86% when each model was trained with typical or varied training sets. The accuracy levels for C4.5, the backpropagation neural network, and discriminant analysis dropped significantly when atypical training sets were used. In contrast, with the exception of one test, SX-WEB was unaffected by training set choice. When comparing the correctness of each model, SX-WEB showed the best overall performance. We conclude this paper with directions for future research.
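The abstract does not say how exemplar typicality was measured, so the following is only a minimal sketch of how typical, varied, and atypical training sets might be drawn from one class. It assumes typicality is the negative mean Euclidean distance from an instance to the other instances of its class. That scoring rule and the function names `typicality_scores` and `select_training_indices` are hypothetical and are not taken from the paper.

```python
import math


def typicality_scores(rows):
    """Score each instance of ONE class (assumed definition, not the paper's).

    Score = negative mean Euclidean distance to the other instances,
    so a higher score means a more typical instance.
    """
    scores = []
    for i, a in enumerate(rows):
        dists = [math.dist(a, b) for j, b in enumerate(rows) if j != i]
        scores.append(-sum(dists) / len(dists) if dists else 0.0)
    return scores


def select_training_indices(rows, k, mode):
    """Return indices of k instances chosen as 'typical', 'atypical', or 'varied'."""
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of instances")
    scores = typicality_scores(rows)
    # Rank instance indices from most typical to least typical.
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    if mode == "typical":
        return ranked[:k]
    if mode == "atypical":
        return ranked[-k:]
    if mode == "varied":
        if k == 1:
            return [ranked[len(ranked) // 2]]
        # Take evenly spaced positions across the whole typicality ranking.
        n = len(ranked)
        return [ranked[(i * (n - 1)) // (k - 1)] for i in range(k)]
    raise ValueError("mode must be 'typical', 'atypical', or 'varied'")


# Toy class: four points on a unit square plus one distant outlier.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
typical = select_training_indices(rows, 1, "typical")
atypical = select_training_indices(rows, 1, "atypical")
varied = select_training_indices(rows, 2, "varied")
```

In an experiment like the one described, each selected subset would be used to train a classifier, and accuracy would then be measured on a held-out test set. The sketch covers only the selection step.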
Pages: 150 - 155
Number of pages: 2
Related papers
50 in total
  • [1] Comparing Three Methods of Selecting Training Samples in Supervised Classification of Multispectral Remote Sensing Images
    Zhang, Hongying
    He, Jinxin
    Chen, Shengbo
    Zhan, Ye
    Bai, Yanyan
    Qin, Yujia
    SENSORS, 2023, 23 (20)
  • [2] Selecting Training Samples for Ovarian Cancer Classification via a Semi-supervised Clustering Approach
    Salguero, Jennifer L.
    Prasanna, Prateek
    Corredor, German
    Cruz-Roa, Angel
    Becerra, David
    Romero, Eduardo
    MEDICAL IMAGING 2022: DIGITAL AND COMPUTATIONAL PATHOLOGY, 2022, 12039
  • [3] Supervised Classification Using Balanced Training
    Du, Mian
    Pierce, Matthew
    Pivovarova, Lidia
    Yangarber, Roman
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2014, 2014, 8791 : 147 - 158
  • [4] The robustness of majority voting compared to filtering misclassified instances in supervised classification tasks
    Smith, Michael R.
    Martinez, Tony
    ARTIFICIAL INTELLIGENCE REVIEW, 2018, 49 (01) : 105 - 130
  • [5] The robustness of majority voting compared to filtering misclassified instances in supervised classification tasks
    Michael R. Smith
    Tony Martinez
    Artificial Intelligence Review, 2018, 49 : 105 - 130
  • [6] Selecting Representative Instances from Datasets
    Mirisaee, Seyed Hamid
    Douzal, Ahlame
    Termier, Alexandre
    PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (IEEE DSAA 2015), 2015 : 291 - 300
  • [7] Generating fuzzy rules from training instances for fuzzy classification systems
    Chen, Shyi-Ming
    Tsai, Fu-Ming
    EXPERT SYSTEMS WITH APPLICATIONS, 2008, 35 (03) : 611 - 621
  • [8] An Efficient Approach to Select Instances in Self-Training and Co-Training Semi-Supervised Methods
    Ovidio Vale, Karliane Medeiros
    Gorgonio, Arthur Costa
    Gorgonio, Flavius Da Luz E.
    De Paula Canuto, Anne Magaly
    IEEE ACCESS, 2022, 10 : 7254 - 7276
  • [9] Selecting the training set in classification problems with rare events
    Scarpa, B
    Torelli, N
    NEW DEVELOPMENTS IN CLASSIFICATION AND DATA ANALYSIS, 2005 : 39 - 46
  • [10] COMPLEXFUZZY: NOVEL CLUSTERING METHOD FOR SELECTING TRAINING INSTANCES OF CROSS-PROJECT DEFECT PREDICTION
    Ozturk, Muhammed Maruf
    COMPUTER SCIENCE-AGH, 2021, 22 (01) : 3 - 37