A sliced inverse regression approach for data stream

Cited by: 14
Authors
Chavent, Marie [1, 2]
Girard, Stephane [3]
Kuentz-Simonet, Vanessa [4]
Liquet, Benoit [5, 6]
Thi Mong Ngoc Nguyen [7]
Saracco, Jerome [1, 2]
Affiliations
[1] Univ Bordeaux, Inst Math Bordeaux, UMR CNRS 5251, F-33405 Talence, France
[2] Inria Bordeaux Sud Ouest, CQFD Team, Talence, France
[3] Inria Grenoble Rhone Alpes, MISTIS Team, LJK, F-38334 Montbonnot St Martin, St Ismier, France
[4] IRSTEA, Unite ADBX Amenites & Dynam Espaces Ruraux, F-33612 Gazinet, Cestas, France
[5] Univ Bordeaux, ISPED, Ctr INSERM Epidemiol Biostat U897, F-33000 Bordeaux, France
[6] INSERM, ISPED, Ctr INSERM Epidemiol Biostat U897, F-33000 Bordeaux, France
[7] Univ Strasbourg, IRMA, UMR 7501, F-67084 Strasbourg, France
Keywords
Effective dimension reduction (EDR); Sliced inverse regression (SIR); Data stream; DIMENSION; SIR
DOI
10.1007/s00180-014-0483-4
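Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714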
Abstract
In this article, we focus on data arriving sequentially by blocks in a stream. A semiparametric regression model involving a common effective dimension reduction (EDR) direction is assumed in each block. Our goal is to estimate this direction at each arrival of a new block. A simple direct approach consists of pooling all the observed blocks and estimating the EDR direction by the sliced inverse regression (SIR) method. In practice, however, this approach has drawbacks, such as the storage of all blocks and the running time for large sample sizes. To overcome these drawbacks, we propose an adaptive SIR estimator of this direction based on the optimization of a quality measure. The corresponding approach is faster both in terms of computational complexity and running time, and provides data storage benefits. The consistency of our estimator is established and its asymptotic distribution is given. An extension to the multiple-index model is proposed. A graphical tool is also provided in order to detect changes in the underlying model, i.e., drift in the EDR direction or aberrant blocks in the data stream. A simulation study illustrates the numerical behavior of our estimator. Finally, an application to real data concerning the estimation of physical properties of the Mars surface is presented.
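As an illustration of the "simple direct approach" mentioned in the abstract, the following is a minimal sketch of classic single-index SIR applied to pooled blocks. It is not the adaptive estimator proposed in the article. The function names `sir_direction` and `pooled_sir`, the number of slices, and the simulated cubic-link model are all assumptions made here for illustration.

```python
import numpy as np

def sir_direction(x, y, n_slices=10):
    """Classic SIR estimate of a single EDR direction (illustrative sketch)."""
    n, p = x.shape
    x_mean = x.mean(axis=0)
    sigma = np.cov(x, rowvar=False)  # p x p sample covariance of x
    # Slice the sample by sorting on y and splitting into near-equal groups.
    order = np.argsort(y)
    gamma = np.zeros((p, p))  # weighted covariance of the slice means
    for idx in np.array_split(order, n_slices):
        diff = x[idx].mean(axis=0) - x_mean
        gamma += (len(idx) / n) * np.outer(diff, diff)
    # The leading eigenvector of sigma^{-1} gamma estimates the direction.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(sigma, gamma))
    b = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    return b / np.linalg.norm(b)

def pooled_sir(blocks, n_slices=10):
    """Pool every observed block, then run SIR once on the pooled sample."""
    x = np.vstack([bx for bx, _ in blocks])
    y = np.concatenate([by for _, by in blocks])
    return sir_direction(x, y, n_slices)

# Simulated stream (assumed model): y = (x' beta)^3 + noise in every block.
rng = np.random.default_rng(0)
beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
blocks = []
for _ in range(3):
    bx = rng.standard_normal((200, 5))
    by = (bx @ beta) ** 3 + 0.5 * rng.standard_normal(200)
    blocks.append((bx, by))

b_hat = pooled_sir(blocks)
# Absolute cosine between estimate and truth (direction is sign-invariant).
print(abs(b_hat @ beta))
```

Because every call to `pooled_sir` stacks all blocks seen so far, memory use and running time grow with the length of the stream, which is the drawback the abstract attributes to the pooled approach.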
Pages: 1129-1152
Page count: 24
Related papers
50 in total
  • [1] A sliced inverse regression approach for data stream
    Marie Chavent
    Stéphane Girard
    Vanessa Kuentz-Simonet
    Benoit Liquet
    Thi Mong Ngoc Nguyen
    Jérôme Saracco
    [J]. Computational Statistics, 2014, 29 : 1129 - 1152
  • [2] Sliced inverse regression for survival data
    Maya Shevlyakova
    Stephan Morgenthaler
    [J]. Statistical Papers, 2014, 55 : 209 - 220
  • [3] Sliced inverse regression for survival data
    Shevlyakova, Maya
    Morgenthaler, Stephan
    [J]. STATISTICAL PAPERS, 2014, 55 (01) : 209 - 220
  • [4] BIG-SIR a Sliced Inverse Regression approach for massive data
    Liquet, Benoit
    Saracco, Jerome
    [J]. STATISTICS AND ITS INTERFACE, 2016, 9 (04) : 509 - 520
  • [5] A Sliced Inverse Regression Approach for a Stratified Population
    Chavent, Marie
    Kuentz, Vanessa
    Liquet, Benoit
    Saracco, Jerome
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (21) : 3857 - 3878
  • [6] Iterative projection of sliced inverse regression with fused approach
    Han, Hyoseon
    Cho, Youyoung
    Yoo, Jae Keun
    [J]. COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2021, 28 (02) : 205 - 215
  • [7] Sparse sliced inverse regression
    Li, Lexin
    Nachtsheim, Christopher J.
    [J]. TECHNOMETRICS, 2006, 48 (04) : 503 - 510
  • [8] Collaborative sliced inverse regression
    Chiancone, Alessandro
    Girard, Stephane
    Chanussot, Jocelyn
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2017, 46 (12) : 6035 - 6053
  • [9] Sliced inverse regression with regularizations
    Li, Lexin
    Yin, Xiangrong
    [J]. BIOMETRICS, 2008, 64 (01) : 124 - 131
  • [10] ASYMPTOTICS OF SLICED INVERSE REGRESSION
    ZHU, LX
    NG, KW
    [J]. STATISTICA SINICA, 1995, 5 (02) : 727 - 736