Manifold-based synthetic oversampling with manifold conformance estimation

Cited by: 46
Authors
Bellinger, Colin [1]
Drummond, Christopher [3]
Japkowicz, Nathalie [2]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] American Univ, Dept Comp Sci, Washington, DC USA
[3] Natl Res Council Canada, Ottawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Class imbalance; Synthetic oversampling; Manifold learning; SMOTE; Dimensionality reduction; Determining number; Neural networks; Classification
DOI
10.1007/s10994-017-5670-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Classification domains such as those in medicine, national security and the environment regularly suffer from a lack of training instances for the class of interest. In many cases, classification models induced under these conditions have poor predictive performance on the important minority class. Synthetic oversampling can be applied to mitigate the impact of imbalance by generating additional training instances. In this field, the majority of research has focused on refining the SMOTE algorithm. We note, however, that the generative bias of SMOTE is not appropriate for the large class of learning problems that conform to the manifold property. These are high-dimensional problems, such as image and spectral classification, with implicit feature spaces that are lower-dimensional than their physical data spaces. We show that ignoring this can lead to instances being generated in erroneous regions of the data space. We propose a general framework for manifold-based synthetic oversampling that helps users to select a domain-appropriate manifold learning method, such as PCA or an autoencoder, and apply it to model and generate additional training samples. We evaluate data generation on theoretical distributions and image classification tasks that are standard in the manifold learning literature, and empirically show its positive impact on the classification of high-dimensional image and gamma-ray spectra tasks, along with 16 UCI datasets.
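The general idea in the abstract (learn a low-dimensional model of the minority class, then generate new samples from it) can be illustrated with a small sketch. This is not the authors' implementation. It assumes PCA as the manifold method, computed with NumPy's SVD, and Gaussian jitter in the latent space as the sampling step. The function name `pca_oversample` and the parameters `n_components` and `noise_scale` are hypothetical choices made for illustration only.

```python
import numpy as np

def pca_oversample(X_min, n_new, n_components, noise_scale=0.1, seed=0):
    """Illustrative sketch: synthetic minority samples on a PCA subspace."""
    rng = np.random.default_rng(seed)
    mean = X_min.mean(axis=0)
    Xc = X_min - mean
    # Principal directions are the right singular vectors of the centred data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]                 # shape (n_components, n_features)
    Z = Xc @ W.T                          # latent coordinates of real samples
    # Pick real samples at random and perturb them in latent space only
    # (assumed sampling scheme, chosen for simplicity).
    idx = rng.integers(0, len(X_min), size=n_new)
    sigma = noise_scale * Z.std(axis=0)
    Z_new = Z[idx] + rng.normal(size=(n_new, n_components)) * sigma
    # Map back to data space; every sample lies on the learned linear subspace.
    return Z_new @ W + mean

# Toy minority class: 20 points on a straight line embedded in 3-D.
t = np.linspace(-1.0, 1.0, 20)
direction = np.array([1.0, 2.0, -1.0])
X_min = np.outer(t, direction) + 5.0
X_syn = pca_oversample(X_min, n_new=50, n_components=1)
print(X_syn.shape)
```

Because the perturbation is applied only in the latent coordinates, the generated points in this toy case stay on the line that the real samples occupy. A nonlinear manifold would need a nonlinear model, such as the autoencoder the abstract mentions, in place of PCA.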
Pages: 605-637
Page count: 33