A TREE-BASED APPROACH FOR ADDRESSING SELF-SELECTION IN IMPACT STUDIES WITH BIG DATA

Cited by: 20
Authors
Yahav, Inbal [1]
Shmueli, Galit [2]
Mani, Deepa [3]
Affiliations
[1] Bar Ilan Univ, Grad Sch Business Adm, Dept Informat Syst, IL-52900 Ramat Gan, Israel
[2] Natl Tsing Hua Univ, Coll Technol Management, Inst Serv Sci, Hsinchu 30013, Taiwan
[3] Indian Sch Business, Hyderabad 500032, Andhra Pradesh, India
Keywords
Self-selection; classification and regression trees; intervention; decision-making; e-governance; outsourcing; analytics; PROPENSITY SCORE ESTIMATION; ESTIMATION NEURAL-NETWORKS; MATCHING METHODS; REGRESSION; ALTERNATIVES; CONTRACTS; INFERENCE; PRIVACY; MARKET; BIAS
DOI
10.25300/MISQ/2016/40.4.02
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training on earnings, analysis of the impact of an electronic governance service in India based on a quasi-experiment, and performance comparison of contract pricing mechanisms and durations in IT outsourcing using observational data.
Pages: 819 / +
Page count: 39