Automated pipeline framework for processing of large-scale building energy time series data

Cited by: 19
Authors
Khalilnejad, Arash [1, 5]
Karimi, Ahmad M. [2, 5]
Kamath, Shreyas [1, 5]
Haddadian, Rojiar [2, 5]
French, Roger H. [2, 4, 5]
Abramson, Alexis R. [3, 6, 7]
Affiliations
[1] Case Western Reserve Univ, Case Sch Engn, Dept Elect Comp & Syst Engn, Cleveland, OH 44106 USA
[2] Case Western Reserve Univ, Dept Comp & Data Sci, Case Sch Engn, Cleveland, OH 44106 USA
[3] Case Western Reserve Univ, Case Sch Engn, Dept Mech & Aerosp Engn, Cleveland, OH 44106 USA
[4] Case Western Reserve Univ, Dept Mat Sci & Engn, Case Sch Engn, Cleveland, OH 44106 USA
[5] Case Western Reserve Univ, Case Sch Engn, SDLE Res Ctr, Cleveland, OH 44106 USA
[6] Case Western Reserve Univ, Case Sch Engn, Great Lakes Energy Inst, Cleveland, OH 44106 USA
[7] Thayer Sch Engn Dartmouth, Hanover, NH USA
Source
PLOS ONE | 2020 / Vol. 15 / Issue 12
Keywords
MAXIMUM HYDROGEN-PRODUCTION; DATA ANALYTICS; PERFORMANCE; CONSUMPTION; OCCUPANCY; MANAGEMENT; SYSTEM; US;
DOI
10.1371/journal.pone.0240461
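Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract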
Commercial buildings account for one-third of the total electricity consumption in the United States, and a significant amount of this energy is wasted. Therefore, there is a need for "virtual" energy audits to identify energy inefficiencies and their associated savings opportunities using methods that can be non-intrusive and automated for application to large populations of buildings. Here we demonstrate virtual energy audits applied to large populations of buildings' time-series smart-meter data using a systematic approach and a fully automated Building Energy Analytics (BEA) Pipeline that unifies, cleans, stores and analyzes building energy datasets in a non-relational data warehouse for efficient insights and results. This BEA pipeline is based on a custom compute job scheduler for a high performance computing cluster to enable parallel processing of Slurm jobs. Within the analytics pipeline, we introduced a data qualification tool that enhances data quality by fixing common errors, while also detecting abnormalities in a building's daily operation using hierarchical clustering. We analyze the HVAC scheduling of a population of 816 buildings, using this analytics pipeline, as part of a cross-sectional study. With our approach, this sample of 816 buildings is improved in data quality and is efficiently analyzed in 34 minutes, which is 85 times faster than sequential processing. The analytical results for the HVAC operational hours of these buildings show that among 10 building use types, food sales buildings with 17.75 hours of daily HVAC cooling operation are decent targets for HVAC savings. Overall, this analytics pipeline enables statistically significant, robust results from population-based studies of large numbers of building energy time-series datasets. These types of BEA studies can explore numerous factors impacting building energy efficiency and virtual building energy audits. This approach enables a new generation of data-driven building energy analysis at scale.
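The abstract mentions detecting abnormalities in a building's daily operation using hierarchical clustering, but gives no details. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes 24-value hourly load profiles, Euclidean distance, average linkage, and a rule that flags days falling in unusually small clusters. All names (`make_day`, `agglomerate`, `flag_abnormal_days`, `max_share`) and the synthetic data are hypothetical.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length daily profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(profiles, n_clusters):
    """Bottom-up hierarchical clustering; returns clusters as lists of day indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        # Find the closest pair of clusters under average linkage.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def flag_abnormal_days(profiles, n_clusters=2, max_share=0.2):
    """Flag days in clusters holding at most `max_share` of all days (assumed rule)."""
    clusters = agglomerate(profiles, n_clusters)
    limit = max_share * len(profiles)
    return sorted(i for c in clusters if len(c) <= limit for i in c)


def make_day(offset, always_on=False):
    """Synthetic hourly load (kW): high from 08:00 to 18:00, or all day if always_on."""
    return [(80 if always_on or 8 <= h < 18 else 20) + offset for h in range(24)]


# Ten synthetic days; day 6 simulates HVAC left running around the clock.
days = [make_day(d % 3) for d in range(10)]
days[6] = make_day(0, always_on=True)

abnormal = flag_abnormal_days(days)
print(abnormal)
```

On real smart-meter data, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would normally replace the quadratic loop above, and the number of clusters and the flagging threshold would need tuning per building.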
Pages: 22
Related papers
50 in total
  • [1] An Analysis Framework for Large-Scale Time Series
    Teng F.
    Huang Q.-C.
    Li T.-R.
    Wang C.
    Tian C.-H.
    Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (07): 1279-1292
  • [2] A Fast Semi-Supervised Clustering Framework for Large-Scale Time Series Data
    He, Guoliang
    Pan, Yanzhou
    Xia, Xuewen
    He, Jinrong
    Peng, Rong
    Xiong, Neal N.
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2021, 51 (07): 4201-4216
  • [3] A large-scale framework for storage, access and analysis of time series data in the manufacturing domain
    Moerzinger, Benjamin
    Weiler, Thomas
    Trautner, Thomas
    Ayatollahi, Iman
    Angerer, Bernhard
    Kittl, Burkhard
    11TH CIRP CONFERENCE ON INTELLIGENT COMPUTATION IN MANUFACTURING ENGINEERING, 2018, 67: 595-600
  • [4] A real-time data acquisition and processing framework for large-scale robot skin
    Youssefi, S.
    Denei, S.
    Mastrogiovanni, F.
    Cannata, G.
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2015, 68: 86-103
  • [5] Toward an Automated HPC Pipeline for Processing Large Scale Electron Microscopy Data
    Vescovi, Rafael
    Li, Hanyu
    Kinnison, Jeffery
    Keceli, Murat
    Salim, Misha
    Kasthuri, Narayanan
    Uram, Thomas D.
    Ferrier, Nicola
    PROCEEDINGS OF 2020 IEEE/ACM 2ND ANNUAL WORKSHOP ON EXTREME-SCALE EXPERIMENT-IN-THE-LOOP COMPUTING (XLOOP 2020), 2020: 16-22
  • [6] An open source analysis framework for large-scale building energy modeling
    Ball, Brian L.
    Long, Nicholas
    Fleming, Katherine
    Balbach, Chris
    Lopez, Phylroy
    JOURNAL OF BUILDING PERFORMANCE SIMULATION, 2020, 13 (05): 487-500
  • [7] MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis
    Teo, Guoshou
    Chew, Wee Siong
    Burla, Bo J.
    Herr, Deron
    Tai, E. Shyong
    Wenk, Markus R.
    Torta, Federico
    Choi, Hyungwon
    ANALYTICAL CHEMISTRY, 2020, 92 (20): 13677-13682
  • [8] YADING: Fast Clustering of Large-Scale Time Series Data
    Ding, Rui
    Wang, Qiang
    Dang, Yingnong
    Fu, Qiang
    Zhang, Haidong
    Zhang, Dongmei
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2015, 8 (05): 473-484
  • [9] A Visualization Pipeline for Large-Scale Tractography Data
    Kress, James
    Anderson, Erik
    Childs, Hank
    2015 IEEE 5TH SYMPOSIUM ON LARGE DATA ANALYSIS AND VISUALIZATION (LDAV), 2015: 115-123
  • [10] Marbor: A Novel Large-Scale Graph Data Storage and Processing Framework
    Zhou, Wei
    Gao, Yun
    Han, Jizhong
    Xu, Zhiyong
    2014 IEEE INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2014,