ADAPTIVE-CLUSTERING BASED METHOD TO ESTIMATE NULL VALUES IN RELATIONAL DATABASES

Citations: 0
Authors
Cheng, Ching-Hsue [2 ]
Chang, Jing-Rong [1 ]
Wei, Liang-Ying [3 ]
Affiliations
[1] Chaoyang Univ Technol, Dept Informat Management, Wufong Township 41349, Taichung County, Taiwan
[2] Natl Yunlin Univ Sci & Technol, Dept Informat Management, Touliu 640, Yunlin, Taiwan
[3] Yuanpei Univ, Dept Informat Management, Hsinchu 30015, Taiwan
Keywords
Relational database systems; Null value; Degree of influential; K-means; Adaptive learning; FUZZY RULES; SYSTEMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data preprocessing is an essential step of knowledge discovery. It comprises data cleaning, data integration, data transformation, data reduction and data discretization. Estimating null values is a data cleaning task. Null values in a database are a significant source of poor data quality, so handling them appropriately is an important part of data preprocessing in relational databases. We propose a new method that uses adaptive learning techniques, based on clustering, to resolve the issue of null values in relational database systems. This study uses clustering algorithms to group data and calculates the degree of influence between the independent attributes (variables) and the dependent attribute through an adaptive learning method (the best adaptive parameter is the one that yields the minimum average error rate). Three databases (a human resource database, Waugh's database and a government salary study database) were selected as the experimental data to compare the mean absolute error rate (MAER) of the proposed algorithm with that of other methods. The results demonstrate that the proposed method outperforms the other methods.
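The pipeline the abstract outlines (cluster the records, estimate the dependent attribute within each cluster, and keep the adaptive parameter with the lowest average error rate) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the MAER definition (mean of |estimate - actual| / |actual|), the tiny 1-D k-means, the `estimate` formula, the candidate grid for `alpha`, and the toy data.

```python
def mean_absolute_error_rate(estimates, actuals):
    """Average of |estimate - actual| / |actual| (assumed MAER definition)."""
    total = sum(abs(e - a) / abs(a) for e, a in zip(estimates, actuals))
    return total / len(actuals)


def kmeans_1d(values, k, iterations=20):
    """Tiny deterministic 1-D k-means; returns one cluster label per value."""
    ordered = sorted(values)
    # Seed the centroids at evenly spaced positions of the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels


def estimate(x, cluster_x, cluster_y, alpha):
    """Hypothetical estimator: the cluster mean of y, shifted by alpha times
    the relative offset of x from the cluster mean of x (assumes mean_x != 0)."""
    mean_x = sum(cluster_x) / len(cluster_x)
    mean_y = sum(cluster_y) / len(cluster_y)
    return mean_y * (1 + alpha * (x - mean_x) / mean_x)


def fit_alpha(xs, ys, k, candidates):
    """Return (alpha, error) for the candidate with the lowest MAER on known rows."""
    labels = kmeans_1d(xs, k)
    best = None
    for alpha in candidates:
        estimates = []
        for x, lab in zip(xs, labels):
            cluster_x = [v for v, g in zip(xs, labels) if g == lab]
            cluster_y = [v for v, g in zip(ys, labels) if g == lab]
            estimates.append(estimate(x, cluster_x, cluster_y, alpha))
        error = mean_absolute_error_rate(estimates, ys)
        if best is None or error < best[1]:
            best = (alpha, error)
    return best


# Toy data (invented): years of experience -> salary.
xs = [1, 2, 3, 10, 11, 12]
ys = [30000, 32000, 34000, 60000, 62000, 64000]
alpha, error = fit_alpha(xs, ys, k=2, candidates=[i / 10 for i in range(11)])
print(alpha, error)
```

Once `alpha` is chosen on the rows whose dependent value is known, a row with a null value would be assigned to its nearest cluster and passed through the same `estimate` function. A grid search is used here only because it is the simplest way to pick a parameter by minimum error; the paper may use a different search.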
Pages: 223-235
Page count: 13