Housing Price Prediction by Divided Regression Analysis

Authors
Goh, Yann Ling [1]
Goh, Yeh Huann [2]
Yip, Chun-Chieh [1]
Ng, Kooi Huat [1]
Affiliations
[1] Univ Tunku Abdul Rahman, Lee Kong Chian Fac Engn & Sci, Jalan Sungai Long, Kajang 43000, Selangor, Malaysia
[2] Kolej Univ Tunku Abdul Rahman, Fac Engn, Jalan Genting Kelang, Kuala Lumpur 53300, Malaysia
Source
CHIANG MAI JOURNAL OF SCIENCE | 2022, Vol. 49, No. 6
Keywords
divided regression; multicollinearity; big data; decision-making; challenges
DOI
10.12982/CMJS.2022.102
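Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09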
Abstract
Regression analysis is a statistical methodology for investigating the relationship between a dependent variable and one or more independent variables. In the current era of big data, statistical analysis of massive datasets raises practical problems. For example, the heavy computing load makes the computation time-consuming, and the accuracy of the results may be affected by the sheer volume of data. Hence, divided regression analysis is proposed to reduce the computing load. This approach divides the dataset into several distinct subsets and fits a multiple linear regression model to each subset. The results obtained from the subsets are then combined into a divided regression model, which is treated as the model for the original overall dataset. The dataset used in this paper is the KC Housesales Data, obtained from the Kaggle website. It contains information about houses, for example, lot size, living area and selling price. The goal of this paper is to predict the selling price of a house from the given attributes. The dataset is partitioned into five subsets, and a multiple linear regression model is fitted to each subset. Model adequacy checks are then applied to the fitted models. Testing for multicollinearity is also important, because collinearity among the independent variables affects the overall results; the variance inflation factor (VIF) is used for this purpose. Finally, the divided regression model is obtained by combining the results from all the subsets, and its validity is verified.
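The workflow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the subsets are formed or how the per-subset results are combined, so the sketch assumes a random partition and a sample-size-weighted average of the coefficient vectors. The data are synthetic (not the KC Housesales Data), and the helper names `fit_ols`, `divided_regression` and `vif` are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; return [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def divided_regression(X, y, n_subsets=5, seed=0):
    """Split rows into n_subsets parts, fit OLS on each, and combine.

    Assumption: the combination rule is a sample-size-weighted average
    of the per-subset coefficients; the paper may use a different rule.
    """
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_subsets)
    betas = np.array([fit_ols(X[p], y[p]) for p in parts])
    weights = np.array([len(p) for p in parts]) / len(X)
    return weights @ betas, betas

def vif(X):
    """Variance inflation factor per column: VIF_j = 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on the other columns."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        beta = fit_ols(others, X[:, j])
        fitted = np.column_stack([np.ones(len(X)), others]) @ beta
        r2 = 1.0 - np.var(X[:, j] - fitted) / np.var(X[:, j])
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic stand-in for housing attributes (three independent predictors).
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 3))
true_beta = np.array([10.0, 2.0, -1.0, 0.5])  # intercept followed by slopes
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=1.0, size=5000)

combined, per_subset = divided_regression(X, y, n_subsets=5)
vifs = vif(X)
print("combined coefficients:", combined)
print("VIF per predictor:", vifs)
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others; larger values signal multicollinearity. The weighted average is one simple way to merge subset fits; pooling the per-subset sums `X'X` and `X'y` would instead reproduce the full-data least-squares solution exactly.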
Pages: 1669-1682
Number of pages: 14