Rapid development in data science has made machine learning and artificial intelligence among the most popular research tools across disciplines. While numerous articles have reported decent predictive ability, little research has examined the impact of complex correlated data. We aim to develop a more accurate model for repeated measures or hierarchical data structures. This study therefore proposes a novel algorithm, the Generalized Estimating Equations Boosting (GEEB) machine, which integrates the gradient boosting technique into the Generalized Estimating Equations (GEE), the benchmark statistical approach for correlated data. Unlike conventional gradient boosting, which uses all input features, the GEEB randomly selects a subset of input features when building the model in order to reduce prediction error. A simulation study evaluates the predictive performance of the GEEB, GEE, eXtreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM) across several hierarchical structures with different sample sizes. The results suggest that the GEEB outperforms the GEE and achieves higher predictive accuracy than the SVM and XGBoost in most situations. In an application to a real-world dataset, the Forest Fire Data, the GEEB reduced mean squared error by 4.5% to 25% compared with GEE, XGBoost, and SVM. This research also provides a freely available R function that implements the GEEB machine for longitudinal or hierarchical data.
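The idea of boosting with randomly selected input features can be illustrated with a minimal Python sketch. This is not the authors' R function and not the GEEB algorithm itself: it assumes a Gaussian outcome with an identity link and an independence working correlation (in which case the GEE estimate coincides with least squares), and it uses a component-wise least-squares base learner. The names `simulate_clusters`, `fit_boost`, `predict`, and `mse`, and all default parameter values, are hypothetical choices made for illustration.

```python
import random


def simulate_clusters(n_clusters=30, cluster_size=5, p=4, seed=1):
    """Toy hierarchical data: a shared cluster effect correlates outcomes within a cluster."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, 0.5)  # cluster-level random effect (assumed, for illustration)
        for _ in range(cluster_size):
            row = [rng.gauss(0.0, 1.0) for _ in range(p)]
            X.append(row)
            # Only the first two features carry signal in this toy example.
            y.append(2.0 * row[0] - 1.0 * row[1] + u + rng.gauss(0.0, 0.3))
    return X, y


def fit_boost(X, y, n_rounds=200, learning_rate=0.1, n_sub=2, seed=0):
    """Component-wise L2 boosting that considers a random feature subset each round.

    Assumption: independence working correlation, so each update is a
    least-squares fit of the current residuals on a single feature.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coefs = [0.0] * p
    pred = [intercept] * n
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        best = None
        # Draw a random subset of features instead of scanning all of them.
        for j in rng.sample(range(p), n_sub):
            sxx = sum(row[j] ** 2 for row in X)
            if sxx == 0.0:
                continue
            beta = sum(row[j] * r for row, r in zip(X, resid)) / sxx
            sse = sum((r - beta * row[j]) ** 2 for row, r in zip(X, resid))
            if best is None or sse < best[0]:
                best = (sse, j, beta)
        if best is None:
            continue
        _, j, beta = best
        # Shrink the update by the learning rate, as in standard gradient boosting.
        coefs[j] += learning_rate * beta
        pred = [pi + learning_rate * beta * row[j] for pi, row in zip(pred, X)]
    return intercept, coefs


def predict(model, X):
    intercept, coefs = model
    return [intercept + sum(c * x for c, x in zip(coefs, row)) for row in X]


def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)


if __name__ == "__main__":
    X, y = simulate_clusters()
    model = fit_boost(X, y)
    print("boosted MSE:", mse(y, predict(model, X)))
    print("mean-only MSE:", mse(y, [sum(y) / len(y)] * len(y)))
```

A fuller treatment would replace the least-squares update with a GEE fit under a non-independence working correlation (for example, exchangeable or AR(1)) and would evaluate error on held-out clusters rather than on the training data.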