Evaluating Scalable Distributed Erlang for Scalability and Reliability

Cited by: 7
Authors
Chechina, Natalia [1]
MacKenzie, Kenneth [1]
Thompson, Simon [3]
Trinder, Phil [2]
Boudeville, Olivier [4]
Fordos, Viktoria [5]
Hoch, Csaba [5]
Ghaffari, Amir [1]
Hernandez, Mario Moro [1]
Affiliations
[1] Univ Glasgow, Glasgow G12 8QQ, Lanark, Scotland
[2] Univ Glasgow, Comp Sci, Glasgow G12 8QQ, Lanark, Scotland
[3] Univ Kent, Sch Comp, Log & Computat, Canterbury CT2 7NZ, Kent, England
[4] EDF R&D, SINETICS Dept, F-92141 Clamart, France
[5] Erlang Solut AB, H-1093 Budapest, Hungary
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Scalability; reliability; actors; Erlang;
DOI
10.1109/TPDS.2017.2654246
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Large scale servers with hundreds of hosts and tens of thousands of cores are becoming common. To exploit these platforms, software must be both scalable and reliable, and distributed actor languages like Erlang are a proven technology in this area. While distributed Erlang conceptually supports the engineering of large scale reliable systems, in practice it has some scalability limits that force developers to depart from the standard language mechanisms at scale. In earlier work we have explored these scalability limitations, and addressed them by providing a Scalable Distributed (SD) Erlang library that partitions the network of Erlang Virtual Machines (VMs) into scalable groups (s_groups). This paper presents the first systematic evaluation of SD Erlang s_groups and associated tools, and how they can be used. We present a comprehensive evaluation of the scalability and reliability of SD Erlang using three typical benchmarks and a case study. We demonstrate that s_groups improve the scalability of reliable and unreliable Erlang applications on up to 256 hosts (6,144 cores). We show that SD Erlang preserves the class-leading distributed Erlang reliability model, but scales far better than the standard model. We present a novel, systematic, and tool-supported approach for refactoring distributed Erlang applications into SD Erlang. We outline the new and improved monitoring, debugging and deployment tools for large scale SD Erlang applications. We demonstrate the scaling characteristics of key tools on systems comprising up to 10K Erlang VMs.
Pages: 2244-2257
Page count: 14