Automated pipeline framework for processing of large-scale building energy time series data

Cited by: 19

Authors:

Khalilnejad, Arash [1,5]

Karimi, Ahmad M. [2,5]

Kamath, Shreyas [1,5]

Haddadian, Rojiar [2,5]

French, Roger H. [2,4,5]

Abramson, Alexis R. [3,6,7]

Affiliations:

[1] Case Western Reserve Univ, Case Sch Engn, Dept Elect Comp & Syst Engn, Cleveland, OH 44106 USA

[2] Case Western Reserve Univ, Dept Comp & Data Sci, Case Sch Engn, Cleveland, OH 44106 USA

[3] Case Western Reserve Univ, Case Sch Engn, Dept Mech & Aerosp Engn, Cleveland, OH 44106 USA

[4] Case Western Reserve Univ, Dept Mat Sci & Engn, Case Sch Engn, Cleveland, OH 44106 USA

[5] Case Western Reserve Univ, Case Sch Engn, SDLE Res Ctr, Cleveland, OH 44106 USA

[6] Case Western Reserve Univ, Case Sch Engn, Great Lakes Energy Inst, Cleveland, OH 44106 USA

[7] Thayer Sch Engn Dartmouth, Hanover, NH USA

Source:

PLOS ONE | 2020 / Vol. 15 / Issue 12

Keywords:

MAXIMUM HYDROGEN-PRODUCTION; DATA ANALYTICS; PERFORMANCE; CONSUMPTION; OCCUPANCY; MANAGEMENT; SYSTEM; US;

DOI:

10.1371/journal.pone.0240461

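Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];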

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Commercial buildings account for one third of total electricity consumption in the United States, and a significant amount of this energy is wasted. There is therefore a need for "virtual" energy audits that identify energy inefficiencies and their associated savings opportunities using non-intrusive, automated methods applicable to large populations of buildings. Here we demonstrate virtual energy audits applied to time-series smart-meter data from large populations of buildings, using a systematic approach and a fully automated Building Energy Analytics (BEA) pipeline that unifies, cleans, stores, and analyzes building energy datasets in a non-relational data warehouse. The BEA pipeline is built on a custom compute job scheduler for a high-performance computing cluster that enables parallel processing of Slurm jobs. Within the analytics pipeline, we introduce a data qualification tool that improves data quality by fixing common errors and detects abnormalities in a building's daily operation using hierarchical clustering. As part of a cross-sectional study, we use the pipeline to analyze the HVAC scheduling of a population of 816 buildings. With our approach, the data quality of this 816-building sample is improved and the sample is analyzed in 34 minutes, 85 times faster than sequential processing. The results for HVAC operational hours show that, among 10 building use types, food sales buildings, with 17.75 hours of daily HVAC cooling operation, are good targets for HVAC savings. Overall, the pipeline enables statistically significant, robust results to be identified from population-based studies of large numbers of building energy time-series datasets. Such BEA studies can explore numerous factors affecting building energy efficiency and virtual building energy audits. This approach enables a new generation of data-driven building energy analysis at scale.
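
The abstract mentions detecting abnormalities in a building's daily operation with hierarchical clustering but does not give the details. The following is a minimal sketch of that general idea, not the authors' implementation: the single-linkage rule, the Euclidean distance, the merge threshold, the "small cluster means abnormal" rule, and the sample load values are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(profiles, threshold):
    """Single-linkage agglomerative clustering (assumed linkage rule).

    Starts with one cluster per daily profile and repeatedly merges the
    two closest clusters until the smallest inter-cluster distance
    exceeds `threshold`. Returns a list of clusters of day indices.
    """
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        # Single linkage: Euclidean distance between the closest members.
        return min(dist(profiles[i], profiles[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[a], clusters[b]) > threshold:
            break
        # a < b always holds, so deleting b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def abnormal_days(profiles, threshold, min_size=2):
    """Flag days whose cluster has fewer than `min_size` members."""
    flagged = []
    for cluster in agglomerative_clusters(profiles, threshold):
        if len(cluster) < min_size:
            flagged.extend(cluster)
    return sorted(flagged)


# Hypothetical daily load profiles (kW), four readings per day for brevity.
days = [
    [10, 40, 42, 12],  # typical day: low overnight, high during the day
    [11, 41, 40, 11],  # typical day
    [10, 39, 41, 13],  # typical day
    [38, 40, 41, 39],  # load stays high overnight
]
flagged = abnormal_days(days, threshold=10.0)
print(flagged)  # [3]
```

In this toy data the three typical days merge into one cluster, while the day with high overnight load stays isolated and is flagged. A real pipeline would work on full-resolution smart-meter profiles and would need a principled way to choose the threshold.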


Pages: 22

50 items in total

[1] An Analysis Framework for Large-Scale Time Series
Teng F.
Huang Q.-C.
Li T.-R.
Wang C.
Tian C.-H.
Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (07) : 1279 - 1292
[2] A Fast Semi-Supervised Clustering Framework for Large-Scale Time Series Data
He, Guoliang
Pan, Yanzhou
Xia, Xuewen
He, Jinrong
Peng, Rong
Xiong, Neal N.
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2021, 51 (07) : 4201 - 4216
[3] A large-scale framework for storage, access and analysis of time series data in the manufacturing domain
Moerzinger, Benjamin
Weiler, Thomas
Trautner, Thomas
Ayatollahi, Iman
Angerer, Bernhard
Kittl, Burkhard
11TH CIRP CONFERENCE ON INTELLIGENT COMPUTATION IN MANUFACTURING ENGINEERING, 2018, 67 : 595 - 600
[4] A real-time data acquisition and processing framework for large-scale robot skin
Youssefi, S.
Denei, S.
Mastrogiovanni, F.
Cannata, G.
ROBOTICS AND AUTONOMOUS SYSTEMS, 2015, 68 : 86 - 103
[5] Toward an Automated HPC Pipeline for Processing Large Scale Electron Microscopy Data
Vescovi, Rafael
Li, Hanyu
Kinnison, Jeffery
Keceli, Murat
Salim, Misha
Kasthuri, Narayanan
Uram, Thomas D.
Ferrier, Nicola
PROCEEDINGS OF 2020 IEEE/ACM 2ND ANNUAL WORKSHOP ON EXTREME-SCALE EXPERIMENT-IN-THE-LOOP COMPUTING (XLOOP 2020), 2020 : 16 - 22
[6] An open source analysis framework for large-scale building energy modeling
Ball, Brian L.
Long, Nicholas
Fleming, Katherine
Balbach, Chris
Lopez, Phylroy
JOURNAL OF BUILDING PERFORMANCE SIMULATION, 2020, 13 (05) : 487 - 500
[7] MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis
Teo, Guoshou
Chew, Wee Siong
Burla, Bo J.
Herr, Deron
Tai, E. Shyong
Wenk, Markus R.
Torta, Federico
Choi, Hyungwon
ANALYTICAL CHEMISTRY, 2020, 92 (20) : 13677 - 13682
[8] YADING: Fast Clustering of Large-Scale Time Series Data
Ding, Rui
Wang, Qiang
Dang, Yingnong
Fu, Qiang
Zhang, Haidong
Zhang, Dongmei
PROCEEDINGS OF THE VLDB ENDOWMENT, 2015, 8 (05) : 473 - 484
[9] A Visualization Pipeline for Large-Scale Tractography Data
Kress, James
Anderson, Erik
Childs, Hank
2015 IEEE 5TH SYMPOSIUM ON LARGE DATA ANALYSIS AND VISUALIZATION (LDAV), 2015 : 115 - 123
[10] Marbor: A Novel Large-Scale Graph Data Storage and Processing Framework
Zhou, Wei
Gao, Yun
Han, Jizhong
Xu, Zhiyong
2014 IEEE INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2014
