A stochastic approximation approach to fixed instance selection

Cited by: 0
Authors
Yeo G.F.A. [1 ]
Akman D. [1 ]
Hudson I. [1 ]
Chan J. [2 ]
Affiliations
[1] School of Science (Mathematical Sciences), Royal Melbourne Institute of Technology, 124 La Trobe St, Melbourne, VIC 3000, Australia
[2] School of Computing Technologies (Computer Science), Royal Melbourne Institute of Technology, 124 La Trobe St, Melbourne, VIC 3000, Australia
Keywords
Dimensionality reduction; Gradient descent optimisation; Instance selection; Stochastic approximation
DOI: 10.1016/j.ins.2023.01.090
Abstract
Instance selection plays a critical role in enhancing the efficacy and efficiency of machine learning tools used for data mining tasks. This study proposes SpFixedIS, a fixed instance selection algorithm based on simultaneous perturbation stochastic approximation that works in conjunction with any supervised machine learning method and any corresponding performance metric. The algorithm provides an approximate solution to the NP-hard instance selection problem and serves as a way of intelligently selecting a specified number of instances from a training set with respect to a given machine learning model. The shape of the objective function, obtained by plotting test accuracy against the number of instances selected, is examined extensively for the proposed algorithm. SpFixedIS was tested on 43 diverse datasets across 6 different machine learning classifiers. The results show that in over 90% of cases SpFixedIS provides a statistically significant improvement, at the 5% level, over random selection of the same number of instances. Furthermore, for probabilistic models, specifically Gaussian Naive Bayes, SpFixedIS provides a statistically significant improvement over models that utilise the entire training set in 84% of the experimented subset sizes, which ranged from 50 to 1000 instances.
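To make the idea concrete, below is a minimal, illustrative sketch of how simultaneous perturbation stochastic approximation (SPSA) can drive fixed-size instance selection. It is not the paper's exact SpFixedIS procedure: the relaxation to continuous per-instance weights, the top-k mapping from weights to a subset, the gain constants `a` and `c`, the iteration budget, and the use of Gaussian Naive Bayes with validation accuracy as the performance metric are all assumptions made for the example.

```python
# Illustrative SPSA-based fixed instance selection (a sketch, not the published SpFixedIS).
# Assumptions: continuous selection weights optimised by SPSA; the k highest-weighted
# training instances form the selected subset; validation accuracy is the objective.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

def spsa_fixed_instance_selection(X_tr, y_tr, X_val, y_val, k=50,
                                  iters=200, a=0.1, c=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = X_tr.shape[0]
    w = np.full(n, 0.5)                         # continuous selection weights in [0, 1]

    def evaluate(weights):
        idx = np.argsort(weights)[-k:]          # map weights to a fixed-size subset (top-k)
        model = GaussianNB().fit(X_tr[idx], y_tr[idx])
        return accuracy_score(y_val, model.predict(X_val))

    for _ in range(iters):
        delta = rng.choice([-1.0, 1.0], size=n)              # Rademacher perturbation
        y_plus = evaluate(np.clip(w + c * delta, 0.0, 1.0))  # objective at w + c*delta
        y_minus = evaluate(np.clip(w - c * delta, 0.0, 1.0)) # objective at w - c*delta
        ghat = (y_plus - y_minus) / (2.0 * c * delta)        # SPSA gradient estimate
        w = np.clip(w + a * ghat, 0.0, 1.0)                  # ascend validation accuracy
    return np.argsort(w)[-k:]

if __name__ == "__main__":
    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    subset = spsa_fixed_instance_selection(X_tr, y_tr, X_val, y_val, k=50)
    model = GaussianNB().fit(X_tr[subset], y_tr[subset])
    print("Accuracy with 50 selected instances:", accuracy_score(y_val, model.predict(X_val)))
```

The key design point the sketch shares with SPSA in general is that each iteration needs only two objective evaluations, regardless of the number of training instances, which is what makes the approach tractable for large candidate sets.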
Pages: 558-579
Number of pages: 21