Data mining and the impact of missing data

Cited by: 95
Authors
Brown, ML [1]
Kros, JF [2]
Affiliations
[1] Hawaii Pacific Univ, Sch Business, Honolulu, HI USA
[2] E Carolina Univ, Dept Decis Sci, Greenville, NC USA
Keywords
data handling; database management systems; information gathering; information retrieval
DOI
10.1108/02635570310497657
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The actual data mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Therefore, the significance of the analysis depends heavily on the accuracy of the database and on the chosen sample data to be used for model training and testing. Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable percentage of inaccurate data, pollution, outliers and noise. The issue of missing data must be addressed since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of missing data on the data mining process.
Pages: 611-621
Page count: 11
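
Illustration
The abstract's claim that ignoring missing data can introduce bias can be illustrated with a small simulation. The sketch below is not taken from the paper: the variable, its distribution, the threshold of 55 and the missingness rates are all assumptions chosen for illustration. It drops incomplete records (complete-case analysis) when larger values are more likely to be missing, then compares the resulting mean with the true mean.

```python
import random
import statistics


def simulate(n=10_000, seed=0):
    """Return (true_mean, complete_case_mean, n_observed) for synthetic data.

    All numbers here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Hypothetical variable: normally distributed, mean 50, std dev 10.
    values = [rng.gauss(50.0, 10.0) for _ in range(n)]

    observed = []
    for v in values:
        # Assumed missingness mechanism: the chance of a value being
        # missing depends on the value itself (high values vanish more often).
        p_missing = 0.6 if v > 55.0 else 0.05
        if rng.random() >= p_missing:
            observed.append(v)

    # Complete-case analysis: simply ignore the records that went missing.
    return statistics.mean(values), statistics.mean(observed), len(observed)


if __name__ == "__main__":
    true_mean, cc_mean, n_obs = simulate()
    print(f"true mean:          {true_mean:.2f}")
    print(f"complete-case mean: {cc_mean:.2f} (from {n_obs} records)")
```

Under these assumptions the complete-case mean falls below the true mean, because the discarded records are not a random subset of the data. If values were missing purely at random, the two means would be expected to agree up to sampling noise.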