Scalable graph-based OLAP analytics over process execution data

Authors
Seyed-Mehdi-Reza Beheshti
Boualem Benatallah
Hamid Reza Motahari-Nezhad
Affiliations
[1] University of New South Wales, School of Computer Science and Engineering
[2] IBM Almaden Research Center
Keywords
Process analytics; Business analytics; Big data analytics; Graph OLAP; OLAP
DOI
Not available
Abstract
In today’s knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of data from a variety of sources. To understand a business, one may need to perform considerable analytics over large hybrid collections of heterogeneous and partially unstructured data captured during process execution. This data, usually modeled as graphs, increasingly shows all the typical properties of big data: wide physical distribution, diversity of formats, non-standard data models, and independently managed, heterogeneous semantics. We use the term big process graph to refer to such large hybrid collections of heterogeneous and partially unstructured process-related execution data. Online analytical processing (OLAP) of big process graphs is challenging because extending existing OLAP techniques to the analysis of graphs is not straightforward. Moreover, process data analysis methods should be capable of processing and querying large amounts of data effectively and efficiently, and must therefore scale well with the size of the infrastructure. While traditional analytics solutions (relational DBs, data warehouses, and OLAP) do a great job of collecting data and answering known questions, key business insights remain hidden in the interactions among objects: it is hard to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. In this paper, we introduce a framework and a set of methods to support scalable graph-based OLAP analytics over process execution data. The goal is to facilitate analytics over big process graphs by summarizing the process graph and providing multiple views at different granularities. To achieve this goal, we present a model for process OLAP (P-OLAP) and define OLAP-specific abstractions in the process context, such as process cubes, dimensions, and cells. We present a MapReduce-based graph processing engine to support big data analytics over process graphs. We have implemented the P-OLAP framework and integrated it into our existing process data analytics platform, ProcessAtlas, which introduces a scalable architecture for querying, exploration, and analysis of large process data. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
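To make the idea of summarizing a process graph at different granularities more concrete, the following is a minimal sketch in Python. It is not the authors' P-OLAP model or the ProcessAtlas engine: the toy graph, the attribute names `kind` and `dept`, and the functions `map_edge`, `reduce_counts`, and `roll_up` are all hypothetical, and the map and reduce steps run in a single process purely to illustrate the pattern. Each edge is mapped to a key built from one chosen attribute (a stand-in for a dimension) of its endpoints, and the reduce step counts edges per key, so each resulting key behaves loosely like a cell of a summarized view.

```python
from collections import defaultdict

# Hypothetical toy process graph: nodes carry attributes that act as
# candidate dimensions; edges are labeled interactions between nodes.
nodes = {
    "alice": {"kind": "person", "dept": "sales"},
    "bob": {"kind": "person", "dept": "sales"},
    "carol": {"kind": "person", "dept": "finance"},
    "order1": {"kind": "artifact", "dept": "sales"},
    "invoice1": {"kind": "artifact", "dept": "finance"},
}
edges = [
    ("alice", "order1", "created"),
    ("bob", "order1", "updated"),
    ("carol", "invoice1", "created"),
    ("bob", "invoice1", "reviewed"),
]


def map_edge(edge, nodes, dimension):
    """Map step: emit (group key, 1) for one edge.

    The key replaces each endpoint with its value along `dimension`.
    """
    src, dst, label = edge
    key = (nodes[src][dimension], nodes[dst][dimension], label)
    return key, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def roll_up(nodes, edges, dimension):
    """Summarize the graph along one dimension (one view, one granularity)."""
    return reduce_counts(map_edge(edge, nodes, dimension) for edge in edges)


by_kind = roll_up(nodes, edges, "kind")  # coarse view: person -> artifact
by_dept = roll_up(nodes, edges, "dept")  # alternative view: by department
```

In this sketch, `by_kind` merges the two `created` edges into a single summarized edge between `person` and `artifact`, while `by_dept` keeps the cross-department `reviewed` interaction visible. Choosing a different attribute yields a different view of the same graph, which is the general intuition behind offering multiple views at different granularities.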
Pages: 379 - 423
Page count: 44