On the distributed software architecture of a data analysis workflow: A case study

Authors
Tasgetiren, Nail [1]
Tigrak, Umit [2]
Bozan, Erdal [2]
Gul, Guven [2]
Demirci, Emir [2]
Saribiyik, Hakan [2]
Aktas, Mehmet S. [1]
Affiliations
[1] Yildiz Tech Univ, Dept Comp Engn, Istanbul, Turkey
[2] Fibabanka, Res & Dev Ctr, Istanbul, Turkey
Keywords
data analysis workflow; distributed software architecture; facade design pattern; lambda software architecture; machine learning workflows
DOI
10.1002/cpe.6522
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Hybrid distributed computing software architectures are gaining importance in data analysis workflows as the number of available machine learning libraries and data storage systems increases. We argue that novel software architecture designs are needed so that machine learning data analysis workflows can run on top of different subsystem libraries. To address this need, we propose a hybrid distributed software architecture in this manuscript. The proposed architecture manages machine learning models for both supervised and unsupervised data analysis workflows. To show the usability of the proposed architecture, we implement a prototype for the banking sector as a case study. The prototype application includes two data analysis workflows: one that predicts customers' loan usage tendency, and one that clusters customers based on their usage patterns of banking loans. The prototype is tested on a large-scale banking dataset, and performance tests are carried out to measure the responsiveness and scalability of the system. The results indicate that the proposed architecture is usable.
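The abstract and keywords name the facade design pattern as the means of running workflows on top of different subsystem libraries. The following is only a minimal sketch of that general idea, not the authors' implementation: the class names (`MLBackend`, `InMemoryBackend`, `BatchedBackend`, `AnalysisFacade`), the method names, and the toy scoring logic are all hypothetical stand-ins chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence

Row = Sequence[float]


class MLBackend(ABC):
    """Hypothetical adapter interface; one subclass per underlying library."""

    @abstractmethod
    def predict(self, rows: Sequence[Row]) -> List[int]:
        ...

    @abstractmethod
    def cluster(self, rows: Sequence[Row], k: int) -> List[int]:
        ...


class InMemoryBackend(MLBackend):
    """Toy stand-in for a single-node library (assumption, not a real model)."""

    def predict(self, rows: Sequence[Row]) -> List[int]:
        # Label a row 1 when its mean feature value exceeds 0.5.
        return [1 if sum(r) / len(r) > 0.5 else 0 for r in rows]

    def cluster(self, rows: Sequence[Row], k: int) -> List[int]:
        # Split the range of row means into k equal-width buckets.
        scores = [sum(r) / len(r) for r in rows]
        lo, hi = min(scores), max(scores)
        width = (hi - lo) / k or 1.0  # avoid division by zero when all means match
        return [min(int((s - lo) / width), k - 1) for s in scores]


class BatchedBackend(InMemoryBackend):
    """Toy stand-in for a distributed library: same results, batched prediction."""

    def __init__(self, batch_size: int = 2) -> None:
        self.batch_size = batch_size

    def predict(self, rows: Sequence[Row]) -> List[int]:
        labels: List[int] = []
        for start in range(0, len(rows), self.batch_size):
            labels.extend(super().predict(rows[start:start + self.batch_size]))
        return labels


class AnalysisFacade:
    """Single entry point that hides which backend runs a workflow."""

    def __init__(self) -> None:
        self._backends: Dict[str, MLBackend] = {}

    def register(self, name: str, backend: MLBackend) -> None:
        self._backends[name] = backend

    def predict_loan_tendency(self, rows: Sequence[Row], backend: str) -> List[int]:
        return self._backends[backend].predict(rows)

    def cluster_customers(self, rows: Sequence[Row], k: int, backend: str) -> List[int]:
        return self._backends[backend].cluster(rows, k)


facade = AnalysisFacade()
facade.register("in_memory", InMemoryBackend())
facade.register("batched", BatchedBackend(batch_size=2))

customers = [[0.9, 0.8], [0.1, 0.2], [0.6, 0.7]]  # made-up feature rows
supervised = facade.predict_loan_tendency(customers, backend="batched")
unsupervised = facade.cluster_customers(customers, k=2, backend="in_memory")
```

Under these assumptions, calling code depends only on the facade, so a backend can be swapped by changing the `backend` argument without touching the workflow logic.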
Pages: 13