Automated pipeline framework for processing of large-scale building energy time series data

Cited by: 19
Authors
Khalilnejad, Arash [1, 5]
Karimi, Ahmad M. [2, 5]
Kamath, Shreyas [1, 5]
Haddadian, Rojiar [2, 5]
French, Roger H. [2, 4, 5]
Abramson, Alexis R. [3, 6, 7]
Affiliations
[1] Case Western Reserve Univ, Case Sch Engn, Dept Elect Comp & Syst Engn, Cleveland, OH 44106 USA
[2] Case Western Reserve Univ, Dept Comp & Data Sci, Case Sch Engn, Cleveland, OH 44106 USA
[3] Case Western Reserve Univ, Case Sch Engn, Dept Mech & Aerosp Engn, Cleveland, OH 44106 USA
[4] Case Western Reserve Univ, Dept Mat Sci & Engn, Case Sch Engn, Cleveland, OH 44106 USA
[5] Case Western Reserve Univ, Case Sch Engn, SDLE Res Ctr, Cleveland, OH 44106 USA
[6] Case Western Reserve Univ, Case Sch Engn, Great Lakes Energy Inst, Cleveland, OH 44106 USA
[7] Thayer Sch Engn Dartmouth, Hanover, NH USA
Source
PLOS ONE | 2020, Vol. 15, Issue 12
Keywords
MAXIMUM HYDROGEN-PRODUCTION; DATA ANALYTICS; PERFORMANCE; CONSUMPTION; OCCUPANCY; MANAGEMENT; SYSTEM; US;
DOI
10.1371/journal.pone.0240461
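Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract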
Commercial buildings account for one-third of the total electricity consumption in the United States, and a significant amount of this energy is wasted. Therefore, there is a need for "virtual" energy audits that identify energy inefficiencies and their associated savings opportunities using methods that are non-intrusive and automated, so they can be applied to large populations of buildings. Here we demonstrate virtual energy audits applied to the time-series smart-meter data of large populations of buildings, using a systematic approach and a fully automated Building Energy Analytics (BEA) Pipeline that unifies, cleans, stores and analyzes building energy datasets in a non-relational data warehouse for efficient insights and results. This BEA pipeline is based on a custom compute job scheduler for a high performance computing cluster to enable parallel processing of Slurm jobs. Within the analytics pipeline, we introduce a data qualification tool that enhances data quality by fixing common errors, while also detecting abnormalities in a building's daily operation using hierarchical clustering. We analyze the HVAC scheduling of a population of 816 buildings, using this analytics pipeline, as part of a cross-sectional study. With our approach, the data quality of this sample of 816 buildings is improved and the sample is analyzed in 34 minutes, which is 85 times faster than sequential processing. The analytical results for the HVAC operational hours of these buildings show that, among 10 building use types, food sales buildings, with 17.75 hours of daily HVAC cooling operation, are good targets for HVAC savings. Overall, this analytics pipeline enables the identification of statistically significant, robust results from population-based studies of large numbers of building energy time-series datasets. These types of BEA studies can explore numerous factors impacting building energy efficiency and virtual building energy audits. This approach enables a new generation of data-driven building energy analysis at scale.
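The abstract mentions detecting abnormalities in a building's daily operation using hierarchical clustering. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes synthetic 24-hour load profiles, average-linkage agglomerative clustering with Euclidean distance, and a simple rule that treats every day outside the largest cluster as abnormal. All function names (`agglomerate`, `flag_abnormal_days`, and so on) are hypothetical.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length daily profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerate(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a list of day indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


def flag_abnormal_days(profiles, n_clusters=2):
    """Assumption: the largest cluster is 'normal'; all other days are flagged."""
    clusters = agglomerate(profiles, n_clusters)
    largest = max(clusters, key=len)
    return sorted(i for c in clusters if c is not largest for i in c)


def normal_day(d):
    """Synthetic profile (kW): low load at night, high load from 08:00 to 18:00."""
    return [50 + 0.5 * d if 8 <= h < 18 else 10 + 0.5 * d for h in range(24)]


# Six ordinary days plus one day (index 5) where the load never drops overnight.
days = [normal_day(d) for d in range(5)] + [[50.0] * 24] + [normal_day(5)]

print(flag_abnormal_days(days))  # prints [5]
```

In this toy dataset the day with constant load is far from every ordinary day, so it remains in its own cluster and is the only one flagged. A real pipeline would need to choose the number of clusters, the linkage, and the flagging rule from the data.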
Pages: 22