Missing Data Imputation for Geolocation-based Price Prediction Using KNN MCF Method

Cited by: 27
Authors
Sanjar, Karshiev [1]
Bekhzod, Olimov [1]
Kim, Jaesoo [1]
Paul, Anand [1]
Kim, Jeonghong [1]
Affiliations
[1] Kyungpook Natl Univ, Sch Comp Sci & Engn, Daegu 41566, South Korea
Keywords
house price prediction; handling missing data; random forest; SELECTION;
DOI
10.3390/ijgi9040227
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Accurate house price forecasts are very important for formulating national economic policies. In this paper, we offer an effective method to predict the sale prices of houses. Our algorithm includes one-hot encoding to convert text data into numeric data, feature correlation to select only the most correlated variables, and a technique to handle missing data. Our approach handles missing data in large datasets with the K-nearest neighbor algorithm based on the most correlated features (KNN-MCF). To the best of our knowledge, no previous research on handling missing observations has focused on the most important features. With the random forest algorithm, the proposed method reaches a prediction accuracy of 92.01%, outperforming the typical machine learning prediction algorithms it was compared with.
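The abstract does not spell out the steps of KNN-MCF, so the following is only a minimal sketch of one plausible reading: rank the other columns by absolute Pearson correlation with the column that has gaps, then fill each gap with the mean of its k nearest complete rows, measuring distance over the top-ranked columns only. The function names, the toy records, and the column names (`area`, `rooms`, `noise`) are hypothetical and are not taken from the paper. One-hot encoding, feature scaling, and the random forest stage are left out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if undefined)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def most_correlated(rows, column, candidates, top_n):
    """Return the top_n candidate columns ranked by |correlation| with `column`.

    Only rows where both values are present contribute to each correlation.
    """
    scores = {}
    for cand in candidates:
        pairs = [(r[cand], r[column]) for r in rows
                 if r[cand] is not None and r[column] is not None]
        if len(pairs) > 1:
            scores[cand] = abs(pearson([p[0] for p in pairs], [p[1] for p in pairs]))
        else:
            scores[cand] = 0.0
    return sorted(candidates, key=lambda c: scores[c], reverse=True)[:top_n]


def knn_impute(rows, column, features, k=3):
    """Fill missing `column` values with the mean of the k nearest donor rows.

    Distance is Euclidean over `features` only. Features are not rescaled
    here, which is a simplification. Returns new rows; the input is untouched.
    """
    donors = [r for r in rows
              if r[column] is not None and all(r[f] is not None for f in features)]
    filled = []
    for r in rows:
        r = dict(r)  # copy so the caller's data is not modified
        if r[column] is None and all(r[f] is not None for f in features):
            point = [r[f] for f in features]
            nearest = sorted(
                donors, key=lambda d: math.dist(point, [d[f] for f in features])
            )[:k]
            if nearest:
                r[column] = sum(d[column] for d in nearest) / len(nearest)
        filled.append(r)
    return filled


# Hypothetical toy records; None marks a missing observation.
rows = [
    {"area": 50, "rooms": 2, "noise": 7},
    {"area": 60, "rooms": 2, "noise": 1},
    {"area": 100, "rooms": 4, "noise": 5},
    {"area": 110, "rooms": 4, "noise": 2},
    {"area": 105, "rooms": None, "noise": 1},
]

selected = most_correlated(rows, "rooms", ["area", "noise"], top_n=1)
filled = knn_impute(rows, "rooms", selected, k=2)
print(selected, filled[4]["rooms"])
```

In this toy data, `area` tracks `rooms` closely while `noise` does not, so the neighbor search uses `area` alone and takes the two houses with the closest floor area as donors. Searching over `noise` instead would pick donors with unrelated room counts, which is the kind of error that restricting the search to correlated features is meant to avoid.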
Pages: 13
Related papers
50 in total
  • [1] Learning-Based Adaptive Imputation Method with kNN Algorithm for Missing Power Data
    Kim, Minkyung
    Park, Sangdon
    Lee, Joohyung
    Joo, Yongjae
    Choi, Jun Kyun
    [J]. ENERGIES, 2017, 10 (10)
  • [2] Imputation of missing precipitation data using KNN, SOM, RF, and FNN
    Abinash Sahoo
    Dillip Kumar Ghose
    [J]. Soft Computing, 2022, 26 : 5919 - 5936
  • [3] Imputation Method of Missing Values for Dissolved Gas Analysis Data Based on Iterative KNN and XGBoost
    Qiao, Lin
    Ran, Ran
    Wu, He
    Zhou, Qiaoni
    Liu, Sai
    Liu, Yunfei
    [J]. 2018 INTERNATIONAL CONFERENCE ON ALGORITHMS, COMPUTING AND ARTIFICIAL INTELLIGENCE (ACAI 2018), 2018,
  • [4] Imputation of missing precipitation data using KNN, SOM, RF, and FNN
    Sahoo, Abinash
    Ghose, Dillip Kumar
    [J]. SOFT COMPUTING, 2022, 26 (12) : 5919 - 5936
  • [5] Cluster-based KNN Missing Value Imputation for DNA Microarray Data
    Keerin, Phimmarin
    Kurutach, Werasak
    Boongoen, Tossapon
    [J]. PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2012, : 445 - 450
  • [6] Improved KNN Imputation for Missing Values in Gene Expression Data
    Keerin, Phimmarin
    Boongoen, Tossapon
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 70 (02): : 4009 - 4025
  • [7] KNN-DTW Based Missing Value Imputation for Microarray Time Series Data
    Hsu, Hui-Huang
    Yang, Andy C.
    Lu, Ming-Da
    [J]. JOURNAL OF COMPUTERS, 2011, 6 (03) : 418 - 425
  • [8] Iterative KNN Imputation Based on GRA for Missing Values in TPLMS
    Zhu, Ming
    Cheng, Xingbing
    [J]. PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 94 - 99
  • [9] Missing data completion method based on KNN and Random Forest
    Zhang, Songyu
    Zhou, Yuchen
    Yan, Jinghua
    Bu, Fanliang
    [J]. SECOND IYSF ACADEMIC SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND COMPUTER ENGINEERING, 2021, 12079
  • [10] Dimensional Data KNN-Based Imputation
    Yang, Yuzhao
    Darmont, Jerome
    Ravat, Franck
    Teste, Olivier
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2022, 2022, 13389 : 315 - 329