Spatial Rank-Based Augmentation for Nonparametric Online Monitoring and Adaptive Sampling of Big Data Streams

Cited by: 6
Authors
Zan, Xin [1]
Wang, Di [2]
Xian, Xiaochen [1]
Affiliations
[1] Univ Florida, Dept Ind & Syst Engn, Gainesville, FL 32611 USA
[2] Shanghai Jiao Tong Univ, Sch Mech Engn, Dept Ind Engn & Management, Shanghai, Peoples R China
Funding
National Science Foundation (USA); Natural Science Foundation of Shanghai; National Natural Science Foundation of China;
Keywords
Data augmentation; Distribution-free; Internet of Things (IoT); Partial observations; Statistical process control (SPC); CONTROL CHARTS; MEAN VECTOR; THINGS IOT; INTERNET;
DOI
10.1080/00401706.2022.2143903
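Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code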
020208 ; 070103 ; 0714 ;
Abstract
The age of Internet of Things (IoT) has witnessed the rapid development of modern data acquisition devices and communicating-actuating networks, which enables the generation of big data streams shared across platforms for remote and efficient decision making of many critical systems. The monitoring of big data streams remains a challenging task in various practical applications mainly due to their complexity in interrelationships, large volume, and high velocity, which places prohibitive demands on monitoring methodologies and resources. To tackle the challenges of monitoring unexchangeable and correlated big data streams with only partial observations available under resource constraints, we propose a method by incorporating spatial rank-based statistics with effective data augmentation techniques for the online unobservable data streams that can analytically inform the monitoring and sampling decisions based only on partially observed data streams. By exploiting historical data, the proposed method preserves strong descriptive power of general big data streams under partial observations and can explicitly use the correlation among data streams, and thus allows effective monitoring and equitable sampling over general heterogeneous and correlated big data streams, which is free of simplified assumptions (e.g., exchangeability) compared to existing methods. Theoretical investigations are carried out to evaluate the effectiveness of the augmentation statistics as well as the sampling strategy, which guarantee the superiority of the sampling performance over existing methods. Simulations under various scenarios and two real case studies are also conducted to evaluate and validate the performance of the proposed method.
Pages: 243-256
Number of pages: 14
Related papers
50 in total
  • [1] Online monitoring of big data streams: A rank-based sampling algorithm by data augmentation
    Xian, Xiaochen
    Zhang, Chen
    Bonk, Scott
    Liu, Kaibo
    JOURNAL OF QUALITY TECHNOLOGY, 2021, 53 (02) : 135 - 153
  • [2] A Nonparametric Adaptive Sampling Strategy for Online Monitoring of Big Data Streams
    Xian, Xiaochen
    Wang, Andi
    Liu, Kaibo
    2017 13TH IEEE CONFERENCE ON AUTOMATION SCIENCE AND ENGINEERING (CASE), 2017, : 844 - 846
  • [3] A Nonparametric Adaptive Sampling Strategy for Online Monitoring of Big Data Streams
    Xian, Xiaochen
    Wang, Andi
    Liu, Kaibo
    TECHNOMETRICS, 2018, 60 (01) : 14 - 25
  • [4] A spatial-adaptive sampling procedure for online monitoring of big data streams
    Wang, Andi
    Xian, Xiaochen
    Tsung, Fugee
    Liu, Kaibo
    JOURNAL OF QUALITY TECHNOLOGY, 2018, 50 (04) : 329 - 343
  • [5] Online nonparametric monitoring of heterogeneous data streams with partial observations based on Thompson sampling
    Ye, Honghan
    Xian, Xiaochen
    Cheng, Jing-Ru C.
    Hable, Brock
    Shannon, Robert W.
    Elyaderani, Mojtaba Kadkhodaie
    Liu, Kaibo
    IISE TRANSACTIONS, 2023, 55 (04) : 392 - 404
  • [6] A Sequential Rank-Based Nonparametric Adaptive EWMA Control Chart
    Liu, Liu
    Zi, Xuemin
    Zhang, Jian
    Wang, Zhaojun
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2013, 42 (04) : 841 - 859
  • [7] Rank-Based Inference for Survey Sampling Data
    Adekpedjou, Akim
    Bindele, Huybrechts F.
    JOURNAL OF SURVEY STATISTICS AND METHODOLOGY, 2023, 11 (02) : 412 - 432
  • [9] Nonparametric rank-based statistics and significance tests for fuzzy data
    Denoeux, T
    Masson, MH
    Hébert, PA
    FUZZY SETS AND SYSTEMS, 2005, 153 (01) : 1 - 28
  • [10] A spatial rank-based EWMA chart for monitoring linear profiles
    Huwang, Longcheen
    Lin, Jian-Chi
    Lin, Li-Wei
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (18) : 3620 - 3649