Analysis of Learning Influence of Training Data Selected by Distribution Consistency

Cited by: 2
Authors
Hwang, Myunggwon [1, 2]
Jeong, Yuna [1]
Sung, Won-Kyung [1, 2]
Affiliations
[1] Korea Inst Sci & Technol Informat, Intelligent Infrastruct Technol Res Ctr, Daejeon 34141, South Korea
[2] Univ Sci & Technol, Dept Data & HPC Sci, Daejeon 34113, South Korea
Keywords
learning influence; machine learning; training data similarity; distribution consistency
DOI
10.3390/s21041045
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This study proposes a method for selecting core data that are helpful for machine learning. Specifically, we form a two-dimensional distribution based on the similarity of the training data and divide the distribution into grids with fixed ratios. In each grid, we select data based on the distribution consistency (DC) of the target class data and examine how the selection affects the classifier. We use CIFAR-10 for the experiment and set grid ratios ranging from 0.5 to 0.005. The influence of these variables was analyzed using training sets of different sizes selected by high-DC, low-DC (the inverse of high-DC), and random (no criterion) selection. As a result, accuracy improved by an average of 0.95 percentage points (+/- 0.65) and 1.54 percentage points (+/- 0.59) for the grid ratios of 0.008 and 0.005, respectively. These results indicate improved performance compared with the existing approach (data distribution search). In this study, we confirmed that learning performance improved when the training data were selected with very small grids and high-DC settings.
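The selection procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of DC, so the code assumes that the DC of a sample is the fraction of samples in its grid cell that share its class label. It also ranks samples globally rather than per target class, and it takes the two-dimensional coordinates as given instead of computing them from a similarity measure. The function name `select_by_dc` and all of its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def select_by_dc(points_2d, labels, grid_ratio, keep_fraction, mode="high", seed=0):
    """Return sorted indices of the samples kept by a DC-style selection.

    Assumption (not from the paper): the DC of a sample is the share of
    samples in its grid cell that carry the same class label.
    """
    points_2d = np.asarray(points_2d, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    n_keep = max(1, int(round(keep_fraction * n)))

    if mode == "random":
        # Baseline: no criterion, uniform sampling without replacement.
        rng = np.random.default_rng(seed)
        return np.sort(rng.choice(n, size=n_keep, replace=False))

    # Rescale the 2-D distribution to the unit square so that grid_ratio
    # is the cell side length as a fraction of each axis.
    mins = points_2d.min(axis=0)
    span = points_2d.max(axis=0) - mins
    span[span == 0] = 1.0
    unit = (points_2d - mins) / span

    # Assign every sample to a grid cell; clamp the upper edge into the last cell.
    cells_per_axis = int(np.ceil(1.0 / grid_ratio))
    cell_xy = np.minimum((unit / grid_ratio).astype(int), cells_per_axis - 1)
    cell_id = cell_xy[:, 0] * cells_per_axis + cell_xy[:, 1]

    # Count samples per cell and per (cell, class) pair.
    _, cell_inv, cell_counts = np.unique(
        cell_id, return_inverse=True, return_counts=True
    )
    _, label_inv = np.unique(labels, return_inverse=True)
    pair = cell_inv * (label_inv.max() + 1) + label_inv
    _, pair_inv, pair_counts = np.unique(
        pair, return_inverse=True, return_counts=True
    )
    dc = pair_counts[pair_inv] / cell_counts[cell_inv]

    # High-DC keeps the most class-consistent samples, low-DC the least.
    order = np.argsort(-dc if mode == "high" else dc, kind="stable")
    return np.sort(order[:n_keep])


# Toy data: three class-0 points and one class-1 outlier share a cell,
# while two class-1 points sit alone in another cell.
toy_points = [[0, 0], [0.1, 0.1], [0.2, 0.1], [1, 1], [0.9, 0.9], [0.05, 0.05]]
toy_labels = [0, 0, 0, 1, 1, 1]
high_idx = select_by_dc(toy_points, toy_labels, grid_ratio=0.5, keep_fraction=0.5)
low_idx = select_by_dc(toy_points, toy_labels, 0.5, 0.5, mode="low")
```

In this toy case the high-DC selection favors the two samples in the pure class-1 cell, while the low-DC selection picks up the class-1 outlier that sits among class-0 samples. Smaller `grid_ratio` values produce finer cells, which corresponds to the very small grid settings the abstract reports as most effective.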
Pages: 1 / 15
Number of pages: 15