A study of re-sampling methods with regression modeling
Cited by: 0
Authors: Hossain, MA [1]; Woodburn, RL [1]
Affiliation:
[1] Blue Hawk, LLC, Dhaka, Bangladesh
Source:
Keywords:
DOI: not available
Chinese Library Classification (CLC) number: TP18 [Artificial intelligence theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract:
An overwhelming number of data mining applications result in the use of regression models, for example predicting the propensity of a customer to default on a credit card, or the likelihood that a prospect will respond to a direct marketing campaign. Unfortunately, the implementation constraints for many such applications restrict the type of predictive method used to simple linear or logistic regression. While more sophisticated techniques (e.g., neural nets [1]) have built-in processes that make the resulting model highly predictive and robust, developing a robust linear or logistic regression model requires much care and an experienced hand. In business settings, most predictive models are built on a modeling data set and independently validated on a validation data set. Often, the modeling and validation data sets differ in ways that cause the modeler to question whether the model will perform well in the future. This paper explores the use of resampling methods in the model-building steps to help build an optimal model that not only fits both the modeling and validation samples well, but also holds up robustly. The resampling allows many more sample data sets to be considered and eliminates overfitting to the modeling sample.
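The abstract does not spell out the resampling procedure, so the following is only a minimal sketch of one common approach (bootstrap resampling of the modeling sample), not the authors' method. It refits a one-variable linear regression on each resample and scores every fit on a separate validation sample, so the spread of coefficients and validation errors can be inspected. The function names (`fit_line`, `mse`, `make_synthetic`, `bootstrap_fits`) and the synthetic data are assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(a, b, xs, ys):
    """Mean squared error of the line a + b*x on the given sample."""
    return statistics.fmean([(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)])


def make_synthetic(n, seed):
    """Hypothetical data: y = 2 + 0.5*x plus Gaussian noise (illustration only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def bootstrap_fits(xs, ys, val_xs, val_ys, n_resamples=200, seed=0):
    """Refit on bootstrap resamples of the modeling sample.

    Returns a list of (intercept, slope, validation_mse) tuples,
    one per resample that contains at least two distinct x values.
    """
    rng = random.Random(seed)
    n = len(xs)
    results = []
    for _ in range(n_resamples):
        # Draw n row indices with replacement from the modeling sample.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: slope is undefined
        a, b = fit_line(bx, by)
        results.append((a, b, mse(a, b, val_xs, val_ys)))
    return results


model_x, model_y = make_synthetic(80, seed=1)
valid_x, valid_y = make_synthetic(40, seed=2)
fits = bootstrap_fits(model_x, model_y, valid_x, valid_y)
slopes = [b for _, b, _ in fits]
print("slope spread (std dev):", statistics.stdev(slopes))
print("median validation MSE:", statistics.median(m for _, _, m in fits))
```

A narrow slope spread and a stable validation error across resamples suggest the fit does not depend heavily on the particular modeling sample; a wide spread is a warning sign of overfitting. The same loop could wrap a logistic regression fit in place of `fit_line`.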
Pages: 83-91
Page count: 9