Scalable graph-based OLAP analytics over process execution data

Cited by: 37
Authors
Beheshti, Seyed-Mehdi-Reza [1]
Benatallah, Boualem [1]
Motahari-Nezhad, Hamid Reza [1, 2]
Affiliations
[1] Univ New S Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[2] IBM Almaden Res Ctr, San Jose, CA USA
Keywords
Process analytics; Business analytics; Bigdata analytics; Graph OLAP; OLAP; SYSTEMS; SPARQL; INFORMATION; MAPREDUCE; PATTERNS; DESIGN; MODELS
DOI
10.1007/s10619-014-7171-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In today's knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of data from a variety of sources. To understand a business, one may need to perform considerable analytics over large hybrid collections of heterogeneous and partially unstructured data captured during process execution. This data, usually modeled as graphs, increasingly shows all the typical properties of big data: wide physical distribution, diversity of formats, non-standard data models, and independently managed, heterogeneous semantics. We use the term big process graph to refer to such large hybrid collections of heterogeneous and partially unstructured process-related execution data. Online analytical processing (OLAP) of big process graphs is challenging, as extending existing OLAP techniques to the analysis of graphs is not straightforward. Moreover, process data analysis methods should be capable of processing and querying large amounts of data effectively and efficiently, and therefore have to scale well with the infrastructure. While traditional analytics solutions (relational DBs, data warehouses, and OLAP) do a great job of collecting data and answering known questions, key business insights remain hidden in the interactions among objects: it is hard to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. In this paper, we introduce a framework and a set of methods to support scalable graph-based OLAP analytics over process execution data. The goal is to facilitate analytics over big process graphs by summarizing the process graph and providing multiple views at different granularities. To achieve this goal, we present a model for process OLAP (P-OLAP) and define OLAP-specific abstractions in the process context, such as process cubes, dimensions, and cells.
We present a MapReduce-based graph processing engine to support big data analytics over process graphs. We have implemented the P-OLAP framework and integrated it into our existing process data analytics platform, ProcessAtlas, which introduces a scalable architecture for querying, exploration, and analysis of large process data. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
Pages: 379 - 423
Number of pages: 45
Related papers
50 in total
  • [1] Scalable graph-based OLAP analytics over process execution data
    Seyed-Mehdi-Reza Beheshti
    Boualem Benatallah
    Hamid Reza Motahari-Nezhad
    Distributed and Parallel Databases, 2016, 34 : 379 - 423
  • [2] Influence maximization in graph-based OLAP (GOLAP)
    Jin, Jenny
    Zhang, Guigang
    Sheu, Phillip
    Hayakawa, Masahiro
    Kitazawa, Atsushi
    SOCIAL NETWORK ANALYSIS AND MINING, 2019, 9 (01)
  • [3] Influence maximization in graph-based OLAP (GOLAP)
    Jenny Jin
    Guigang Zhang
    Phillip Sheu
    Masahiro Hayakawa
    Atsushi Kitazawa
    Social Network Analysis and Mining, 2019, 9
  • [4] Graph-Based Techniques for Visual Analytics of Scientific Data Sets
    Wang, Chaoli
    COMPUTING IN SCIENCE & ENGINEERING, 2018, 20 (01) : 93 - 103
  • [5] Graph-based techniques for visual analytics of scientific data sets
    Wang C.
    Computing in Science and Engineering, 2018, 20 (01) : 93 - 103
  • [6] ELPIS: Graph-Based Similarity Search for Scalable Data Science
    Azizi, Ilias
    Echihabi, Karima
    Palpanas, Themis
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (06) : 1548 - 1559
  • [7] Graph-based Interactive Data Federation System for Heterogeneous Data Retrieval and Analytics
    Vu, Xuan-Son
    Ait-Mlouk, Addi
    Elmroth, Erik
    Jiang, Lili
    WEB CONFERENCE 2019: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2019), 2019 : 3595 - 3599
  • [8] DIVE: A Graph-Based Visual-Analytics Framework for Big Data
    Rysavy, Steven J.
    Bromley, Dennis
    Daggett, Valerie
    IEEE COMPUTER GRAPHICS AND APPLICATIONS, 2014, 34 (02) : 26 - 37
  • [9] Graph-Based Speculative Query Execution for RDBMS
    Sasak-Okon, Anna
    Tudruj, Marek
    PARALLEL PROCESSING AND APPLIED MATHEMATICS (PPAM 2017), PT I, 2018, 10777 : 303 - 313
  • [10] Bridging the Gap between Relational OLTP and Graph-based OLAP
    Shen, Sijie
    Yao, Zihang
    Shi, Lin
    Wang, Lei
    Lai, Longbin
    Tao, Qian
    Su, Li
    Chen, Rong
    Yu, Wenyuan
    Chen, Haibo
    Zang, Binyu
    Zhou, Jingren
    PROCEEDINGS OF THE 2023 USENIX ANNUAL TECHNICAL CONFERENCE, 2023 : 181 - 196