ADAPTIVE-CLUSTERING BASED METHOD TO ESTIMATE NULL VALUES IN RELATIONAL DATABASES

Cited by: 0
Authors
Cheng, Ching-Hsue [2 ]
Chang, Jing-Rong [1 ]
Wei, Liang-Ying [3 ]
Affiliations
[1] Chaoyang Univ Technol, Dept Informat Management, Wufong Township 41349, Taichung County, Taiwan
[2] Natl Yunlin Univ Sci & Technol, Dept Informat Management, Touliu 640, Yunlin, Taiwan
[3] Yuanpei Univ, Dept Informat Management, Hsinchu 30015, Taiwan
Keywords
Relational database systems; Null value; Degree of influential; K-means; Adaptive learning; FUZZY RULES; SYSTEMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data preprocessing is an essential step of knowledge discovery. Data preprocessing comprises data cleaning, data integration, data transformation, data reduction and data discretization. Estimating null values is a task of data cleaning. Null values in a database are significant sources of poor data quality. Therefore, the appropriate handling of null values is an important task of data preprocessing in relational databases. We propose a new method that uses adaptive learning techniques, based on clustering, to resolve the issue of null values in relational database systems. This study uses clustering algorithms to group data and calculates the degree of influence between independent attributes (variables) and the dependent attribute through an adaptive learning method (the best adaptive parameter can be obtained by the minimum average error rate). Three databases (a human resource database, Waugh's database and a government salary study database) were selected as the experimental data to compare the mean absolute error rate (MAER) of the proposed algorithm with the other methods. The results demonstrate that the proposed method outperforms other methods.
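The abstract does not give the estimation formula, so the following Python sketch only illustrates the general workflow it describes: cluster the complete records with k-means, estimate a missing dependent value from the matching cluster, and pick an adaptive parameter by minimizing the average error rate. Everything specific here is an assumption made for illustration and is not the authors' method: the blend `alpha * cluster_mean + (1 - alpha) * global_mean`, the parameter grid, the toy data, the function names, and the MAER definition (mean of |estimate - actual| / |actual|).

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20):
    """Plain k-means; deterministic init from the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def maer(estimates, actuals):
    """Mean absolute error rate, as assumed in the lead-in."""
    return sum(abs(e - a) / abs(a) for e, a in zip(estimates, actuals)) / len(actuals)


def estimate(x, centroids, cluster_means, global_mean, alpha):
    """Hypothetical estimator: blend the nearest cluster's mean with the global mean."""
    c = min(range(len(centroids)), key=lambda i: dist(x, centroids[i]))
    return alpha * cluster_means[c] + (1 - alpha) * global_mean


def fit(xs, ys, k, alphas):
    """Cluster complete records, then keep the alpha with the lowest in-sample MAER."""
    centroids, labels = kmeans(xs, k)
    global_mean = sum(ys) / len(ys)
    cluster_means = []
    for c in range(k):
        vals = [y for y, lab in zip(ys, labels) if lab == c]
        cluster_means.append(sum(vals) / len(vals) if vals else global_mean)
    best_alpha = min(
        alphas,
        key=lambda a: maer(
            [estimate(x, centroids, cluster_means, global_mean, a) for x in xs], ys
        ),
    )
    return centroids, cluster_means, global_mean, best_alpha


# Toy data (invented): one independent attribute, one dependent attribute.
xs = [(1,), (2,), (10,), (11,)]
ys = [10, 12, 50, 54]
centroids, cluster_means, global_mean, best_alpha = fit(xs, ys, k=2, alphas=[0.0, 0.5, 1.0])

# Estimate the dependent value for a record whose dependent attribute is null.
filled = estimate((10.5,), centroids, cluster_means, global_mean, best_alpha)
```

The parameter is scored on the same records used for clustering, which keeps the sketch short; a held-out or leave-one-out evaluation would give a less optimistic error rate.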
Pages: 223 - 235
Number of pages: 13