Analysis of Learning Influence of Training Data Selected by Distribution Consistency

Cited by: 2
Authors
Hwang, Myunggwon [1 ,2 ]
Jeong, Yuna [1 ]
Sung, Won-Kyung [1 ,2 ]
Affiliations
[1] Korea Inst Sci & Technol Informat, Intelligent Infrastruct Technol Res Ctr, Daejeon 34141, South Korea
[2] Univ Sci & Technol, Dept Data & HPC Sci, Daejeon 34113, South Korea
Keywords
learning influence; machine learning; training data similarity; distribution consistency
DOI
10.3390/s21041045
Chinese Library Classification
O65 [Analytical Chemistry]
Subject Classification Codes
070302; 081704
Abstract
This study proposes a method for selecting core data that are helpful for machine learning. Specifically, we form a two-dimensional distribution based on the similarity of the training data and overlay grids with fixed ratios on that distribution. In each grid cell, we select data based on the distribution consistency (DC) of the target-class data and examine how this selection affects the classifier. We use CIFAR-10 for the experiments and vary the grid ratio from 0.5 to 0.005. The influence of these variables was analyzed for different training-data sizes selected by high-DC, low-DC (the inverse of high DC), and random (no criteria) selection. As a result, average accuracy improvements of 0.95 percentage points (±0.65) and 1.54 percentage points (±0.59) were obtained for grid ratios of 0.008 and 0.005, respectively. These outcomes demonstrate improved performance compared with the existing approach (data distribution search). In this study, we confirmed that learning performance improves when training data are selected with very small grids and high-DC settings.
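The abstract describes the selection pipeline only at a high level, so the following is a minimal sketch, not the authors' implementation. Assumptions beyond the record: t-SNE stands in for the two-dimensional similarity distribution, a cell's DC is approximated by the fraction of its points belonging to the target class, scikit-learn's digits data set is used instead of CIFAR-10 to keep the example lightweight, and the helper name dc_select is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE


def dc_select(emb, labels, target, grid_ratio=0.05, high_dc=True):
    """Return indices of target-class samples drawn from grid cells
    ranked by an assumed DC score (per-cell target-class fraction)."""
    # Scale the 2-D embedding into the unit square so that grid_ratio
    # is the side length of each square cell (0.05 -> a 20x20 grid).
    lo, hi = emb.min(axis=0), emb.max(axis=0)
    pts = (emb - lo) / (hi - lo)
    n = int(np.ceil(1.0 / grid_ratio))
    cells = np.minimum((pts * n).astype(int), n - 1)
    cell_id = cells[:, 0] * n + cells[:, 1]

    ranked = []
    for c in np.unique(cell_id):
        idx = np.where(cell_id == c)[0]
        hits = idx[labels[idx] == target]
        if hits.size:                       # skip cells without target-class data
            ranked.append((hits.size / idx.size, hits))
    # High-DC prefers cells dominated by the target class; low-DC is the
    # inverse ordering, mirroring the paper's high-DC/low-DC comparison.
    ranked.sort(key=lambda t: t[0], reverse=high_dc)
    return np.concatenate([h for _, h in ranked])


# Toy run: digits stands in for CIFAR-10 to keep the sketch self-contained.
X, y = load_digits(return_X_y=True)
emb = TSNE(n_components=2, random_state=0).fit_transform(X)
sel = dc_select(emb, y, target=3, grid_ratio=0.05)
print(f"{sel.size} class-3 samples, ordered from high-DC cells first")
```

Under these assumptions, sweeping grid_ratio from 0.5 down to 0.005 and comparing high-DC, low-DC, and random orderings would reproduce the experimental axes described in the abstract.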
Pages: 1-15 (15 pages)
Related Papers
50 records total
  • [11] Analysis of the Influence of Training Data on Road User Detection
    Guindel, Carlos
    Martin, David
    Maria Armingol, Jose
    Stiller, Christoph
2018 IEEE INTERNATIONAL CONFERENCE ON VEHICULAR ELECTRONICS AND SAFETY (ICVES 2018), 2018
  • [12] Application of deep learning in sheep behaviors recognition and influence analysis of training data characteristics on the recognition effect
    Cheng, Man
    Yuan, Hongbo
    Wang, Qifan
    Cai, Zhenjiang
    Liu, Yueqin
    Zhang, Yingjie
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2022, 198
  • [13] Interpolation Consistency Training for Semi-Supervised Learning
    Verma, Vikas
    Lamb, Alex
    Kannala, Juho
    Bengio, Yoshua
    Lopez-Paz, David
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019: 3635-3641
  • [14] Interpolation consistency training for semi-supervised learning
    Verma, Vikas
    Kawaguchi, Kenji
    Lamb, Alex
    Kannala, Juho
    Solin, Arno
    Bengio, Yoshua
    Lopez-Paz, David
NEURAL NETWORKS, 2022, 145: 90-106
  • [15] Influence of Data Distribution on Federated Learning Performance in Tumor Segmentation
    Luo, Guibo
    Liu, Tianyu
    Lu, Jinghui
    Chen, Xin
    Yu, Lequan
    Wu, Jian
    Chen, Danny Z.
    Cai, Wenli
    RADIOLOGY-ARTIFICIAL INTELLIGENCE, 2023, 5 (03)
  • [16] Consistency and prior falsification of training data in seismic deep learning: Application to offshore deltaic reservoir characterization
    Pradhan, Anshuman
    Mukerji, Tapan
GEOPHYSICS, 2022, 87 (03): N45-N61
  • [17] Training data distribution significantly impacts the estimation of tissue microstructure with machine learning
    Gyori, Noemi G.
    Palombo, Marco
    Clark, Christopher A.
    Zhang, Hui
    Alexander, Daniel C.
MAGNETIC RESONANCE IN MEDICINE, 2022, 87 (02): 932-947
  • [18] Learning when training data are costly: The effect of class distribution on tree induction
    Weiss, GM
    Provost, F
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2003, 19: 315-354
  • [20] Deep Attentive Video Summarization With Distribution Consistency Learning
    Ji, Zhong
    Zhao, Yuxiao
    Pang, Yanwei
    Li, Xi
    Han, Jungong
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (04): 1765-1775