A scalable nonparametric specification testing for massive data

Cited by: 4
Authors
Zhao, Yanyan [1 ,2 ]
Zou, Changliang [1 ,2 ]
Wang, Zhaojun [1 ,2 ]
Affiliations
[1] Nankai Univ, Inst Stat, Tianjin, Peoples R China
[2] Nankai Univ, LPMC, Tianjin, Peoples R China
Keywords
Adaptive test; Asymptotic normality; Lack-of-fit test; Rate-optimal; Sample-splitting method; OF-FIT TESTS; REGRESSION-CURVES; FUNCTIONAL FORM; CONSISTENT TEST; MODEL; SELECTION; EQUALITY; RATES;
DOI
10.1016/j.jspi.2018.09.012
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Lack-of-fit checking for parametric models is essential for reducing misspecification. However, for the massive data sets that are increasingly prevalent, classical tests become prohibitively costly to compute, and their feasibility is questionable even on modern parallel computing platforms. Building on the divide-and-conquer strategy, we propose a new nonparametric testing method that is fast to compute and easy to implement, with only one tuning parameter determined by a given time budget. Under mild conditions, we show that the proposed test statistic is asymptotically equivalent to that based on the whole data. By using the sample-splitting idea to choose the smoothing parameter, the proposed test maintains the type-I error rate well using asymptotic distributions and achieves adaptive rate-optimal detection properties. Its advantage relative to existing methods is also demonstrated in numerical simulations and a data illustration. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 161 - 175
Number of pages: 15
Related papers
50 in total
  • [21] Scalable Bootstrap Clustering for Massive Data
    Wang, Haocheng
    Zhuang, Fuzhen
    Ao, Xiang
    He, Qing
    Shi, Zhongzhi
    2014 15TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2014, : 123 - 128
  • [22] Scalable Strategies for Computing with Massive Data
    Kane, Michael J.
    Emerson, John W.
    Weston, Stephen
    JOURNAL OF STATISTICAL SOFTWARE, 2013, 55 (14): : 1 - 19
  • [23] Scalable Splitting of Massive Data Streams
    Zeitler, Erik
    Risch, Tore
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT II, PROCEEDINGS, 2010, 5982 : 184 - 198
  • [24] Specification testing of discrete choice models: a note on the use of a nonparametric test
    Fosgerau, Mogens
    JOURNAL OF CHOICE MODELLING, 2008, 1 (01) : 26 - 39
  • [25] NONPARAMETRIC METHODS IN SPECIFICATION
    ROBINSON, PM
    ECONOMIC JOURNAL, 1986, 96 : 134 - 141
  • [26] Scalable bootstrap attribute reduction for massive data
    Ji S.
    Shi H.
    Lv Y.
    Guo M.
    Inderscience Publishers, 2018, (12) : 410 - 417
  • [27] Nonparametric Bayesian Extraction of Object Configurations in Massive Data
    Meillier, Celine
    Chatelain, Florent
    Michel, Olivier
    Ayasso, Hacheme
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2015, 63 (08) : 1911 - 1924
  • [28] Nonparametric estimation and specification testing of a two-factor interest rate model
    Thompson, Brennan S.
    ECONOMICS BULLETIN, 2009, 29 (03): : 2343 - 2349
  • [29] Lag selection and model specification testing in nonparametric autoregressive conditional heteroscedastic models
    Zambom, Adriano Z.
    Kim, Seonjin
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2017, 186 : 13 - 27
  • [30] A data-driven nonparametric specification test for dynamic regression models
    Guay, Alain
    Guerre, Emmanuel
    ECONOMETRIC THEORY, 2006, 22 (04) : 543 - 586