Stratified Sampling Design Based on Data Mining

Cited by: 6
Authors
Kim, Yeonkook J. [1]
Oh, Yoonhwan [1]
Park, Sunghoon [2]
Cho, Sungzoon [2]
Park, Hayoung [1]
Affiliations
[1] Seoul Natl Univ, Technol Management Econ & Policy Grad Program, 1 Gwanak Ro, Seoul 151742, South Korea
[2] Seoul Natl Univ, Dept Ind Engn, Seoul, South Korea
Keywords
Sampling Studies; Decision Trees; Data Mining
DOI
10.4258/hir.2013.19.3.186
Chinese Library Classification (CLC) number
R-058
Abstract
Objectives: To explore data mining-based classification rules for defining strata in stratified sampling of healthcare providers, with the aim of improving sampling efficiency.

Methods: We performed k-means clustering to group providers with similar characteristics, then constructed decision trees on the cluster labels to generate stratification rules. To evaluate the performance of the sampling design, we compared the variance explained by the stratification proposed in this study with the variance explained by conventional stratification. We constructed a study database from health insurance claims data and provider profile data made available to this study by the Health Insurance Review and Assessment Service of South Korea, and from population data from Statistics Korea. From this database, we used the 2011 data for single-specialty clinics or hospitals in two specialties, general surgery and ophthalmology.

Results: Data mining yielded five strata in general surgery with two stratification variables, the number of inpatients per specialist and the population density of the provider location, and five strata in ophthalmology with two stratification variables, the number of inpatients per specialist and the number of beds. The stratification explained 22% of the variance in annual changes in specialist productivity in general surgery and 8% in ophthalmology, whereas conventional stratification by type of provider location and number of beds explained 2% and 0.2% of the variance, respectively.

Conclusions: This study demonstrated that data mining methods can be used to design efficient stratified sampling with variables readily available to the insurer and the government. The approach offers an alternative to the existing stratification method that is widely used in healthcare provider surveys in South Korea.
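The variance-explained comparison described in the Methods can be illustrated with a minimal sketch. It assumes the measure is the between-strata sum of squares divided by the total sum of squares (the R² of a one-way ANOVA); the abstract does not state the exact formula, so this is an assumption. The function names, the rule thresholds in `assign_stratum`, and the toy provider records are all hypothetical and are not taken from the study. The k-means and decision-tree steps are not reproduced here; `assign_stratum` only stands in for the kind of rule a fitted tree would produce.

```python
from collections import defaultdict
from statistics import fmean


def variance_explained(values, strata):
    """Share of the total variance in `values` accounted for by stratum means.

    Assumed definition: between-strata sum of squares / total sum of squares.
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    grand_mean = fmean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # no variation to explain

    # Group the outcome values by stratum label.
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)

    between_ss = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


def assign_stratum(inpatients_per_specialist, population_density):
    """Hypothetical tree-style rule; the thresholds are illustrative only."""
    if inpatients_per_specialist < 50:
        if population_density >= 10000:
            return "low volume, dense area"
        return "low volume, sparse area"
    return "high volume"


# Toy records: (inpatients per specialist, population density, outcome).
providers = [
    (10, 15000, 1.0),
    (20, 16000, 2.0),
    (30, 2000, 5.0),
    (40, 1500, 6.0),
    (80, 9000, 9.0),
    (90, 500, 10.0),
]
strata = [assign_stratum(x, d) for x, d, _ in providers]
outcomes = [y for _, _, y in providers]
share = variance_explained(outcomes, strata)
print(f"variance explained: {share:.3f}")
```

Running the same function on two candidate stratifications of the same providers gives the kind of side-by-side comparison the abstract reports. A higher share means the strata are more homogeneous in the outcome, which is what makes a stratified sample more efficient.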
Pages: 186-195
Page count: 10