Analysis of Learning Influence of Training Data Selected by Distribution Consistency

Cited by: 2
Authors
Hwang, Myunggwon [1, 2]
Jeong, Yuna [1]
Sung, Won-Kyung [1, 2]
Affiliations
[1] Korea Inst Sci & Technol Informat, Intelligent Infrastruct Technol Res Ctr, Daejeon 34141, South Korea
[2] Univ Sci & Technol, Dept Data & HPC Sci, Daejeon 34113, South Korea
Keywords
learning influence; machine learning; training data similarity; distribution consistency
DOI
10.3390/s21041045
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This study proposes a method for selecting core data that are helpful for machine learning. Specifically, we form a two-dimensional distribution based on the similarity of the training data and divide the distribution into grids with fixed ratios. In each grid, we select data based on the distribution consistency (DC) of the target class data and examine how the selection affects the classifier. We use CIFAR-10 for the experiment and set various grid ratios from 0.5 to 0.005. The influence of these variables was analyzed using different training data sizes selected by high-DC, low-DC (the inverse of high-DC), and random (no criteria) selection. As a result, average accuracy improved by 0.95 percentage points (+/- 0.65) and 1.54 percentage points (+/- 0.59) for the grid configurations of 0.008 and 0.005, respectively. These outcomes demonstrate improved performance compared with the existing approach (data distribution search). In this study, we confirmed that learning performance improved when the training data were selected with very small grid and high-DC settings.
Pages: 1-15
Page count: 15
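
The selection procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of DC, so the code below assumes that the DC of a grid cell is the share of points in that cell belonging to the target class, and that the two-dimensional coordinates have already been computed from a similarity measure. The function name select_by_dc and the parameters keep_fraction and mode are hypothetical, introduced only for illustration; this is not the authors' implementation.

```python
import numpy as np


def select_by_dc(points, labels, target, grid_ratio, keep_fraction, mode="high"):
    """Pick target-class samples ranked by an ASSUMED per-cell DC score.

    points: (n, 2) coordinates of the training data in a 2-D distribution.
    grid_ratio: cell side length as a fraction of each axis range.
    keep_fraction: share of target-class samples to keep (hypothetical knob).
    mode: "high", "low", or "random" selection.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    # Scale each axis to [0, 1] so grid_ratio is a fraction of the range.
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0
    unit = (points - mins) / spans

    # Assign every point to a grid cell and flatten the 2-D cell index.
    n_cells = int(np.ceil(1.0 / grid_ratio))
    cells = np.minimum((unit / grid_ratio).astype(int), n_cells - 1)
    cell_id = cells[:, 0] * n_cells + cells[:, 1]

    # Assumed DC: fraction of the points in a cell that are target class.
    is_target = labels == target
    total = np.bincount(cell_id, minlength=n_cells * n_cells)
    hits = np.bincount(cell_id, weights=is_target, minlength=n_cells * n_cells)
    dc = np.divide(hits, total, out=np.zeros_like(hits), where=total > 0)

    target_idx = np.flatnonzero(is_target)
    if target_idx.size == 0:
        return target_idx
    n_keep = max(1, int(round(keep_fraction * target_idx.size)))

    if mode == "random":
        rng = np.random.default_rng(0)
        return np.sort(rng.choice(target_idx, size=n_keep, replace=False))

    # Rank target samples by the DC of their cell (descending for "high").
    scores = dc[cell_id[target_idx]]
    order = np.argsort(-scores if mode == "high" else scores, kind="stable")
    return np.sort(target_idx[order[:n_keep]])
```

Calling the function with mode set to "high", "low", or "random" mirrors the three selection strategies compared in the abstract; smaller grid_ratio values such as 0.008 or 0.005 produce finer cells.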