A parallel runtime framework for communication intensive stream applications

Authors
Muralidharan, Servesh [1]
Casey, Kevin [2]
Gregg, David [1]
Affiliations
[1] Univ Dublin Trinity Coll, Lero, Dublin 2, Ireland
[2] Dublin City Univ, Lero & Sch Comp, Dublin 9, Ireland
Keywords
Streaming model; data parallelism; clusters
DOI
10.1109/TrustCom.2013.142
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The performance of stream applications is often limited by the underlying communication system. A typical implementation relies on the operating system to handle the majority of network operations. In such cases, the communication stack, which was not designed to handle very large volumes of data, becomes a bottleneck and restricts the performance of the application. In this paper, we propose a parallel runtime framework that integrates the communication operations with stream applications and provides a common parallel processing engine that can execute both communication and computation operations in parallel on multicore processors. We place an emphasis on the low-level details required to implement such a framework, but also provide some guidelines on how an application programmer can employ it. Our runtime system uses a set of operations represented as filters to perform the relevant computations on the data stream. Filters that handle the application-specific operations are categorized as computation filters, and those that transfer data to and from network devices are classified as communication filters. Computation filters are designed by the user and are specific to the application. Communication filters are provided by the runtime system and are built using system software that allows direct access to network hardware. Such system software allows the runtime system to perform network operations in parallel, leading to better communication performance. Applications designed for this framework are built by constructing application-specific computation filters and then connecting them to the communication filters provided by the runtime system. This hides the low-level programming of network adapters and protocols from the application developer, making it easier to build stream applications that take advantage of the improved communication performance.
Moreover, by dynamically replicating and statically scheduling such filters on the given multicore architecture, the runtime system can process multiple data streams in parallel. We are able to parallelize stream applications and achieve speedups of more than a factor of eight in all the applications we tested. The results show that our system scales to as many parallel processes as there are cores on our computer, and achieves speedups of more than a factor of ten in some cases compared to sequential implementations.
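The filter model described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `ReceiveFilter`, `ComputeFilter`, and `SendFilter` and the functions `run_pipeline` and `run_replicated` are hypothetical, and the communication filters below read from and write to in-memory lists rather than accessing network hardware directly. A thread pool stands in for the replication of a filter chain over several data streams; in CPython, threads do not give true parallel speedup for CPU-bound work, so the sketch shows only the structure, not the performance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, Iterator, List, Sequence


class ReceiveFilter:
    """Hypothetical communication filter: yields items from an in-memory
    stream that stands in for a network device."""

    def __init__(self, stream: Iterable[Any]) -> None:
        self.stream = stream

    def run(self) -> Iterator[Any]:
        yield from self.stream


class ComputeFilter:
    """User-defined computation filter: applies one function per item."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def run(self, upstream: Iterable[Any]) -> Iterator[Any]:
        for item in upstream:
            yield self.fn(item)


class SendFilter:
    """Hypothetical communication filter: collects items in a list that
    stands in for an outgoing network device."""

    def __init__(self) -> None:
        self.sent: List[Any] = []

    def run(self, upstream: Iterable[Any]) -> List[Any]:
        for item in upstream:
            self.sent.append(item)
        return self.sent


def run_pipeline(stream: Iterable[Any],
                 stages: Sequence[Callable[[Any], Any]]) -> List[Any]:
    """Connect receive -> computation filters -> send for one data stream."""
    data: Iterable[Any] = ReceiveFilter(stream).run()
    for fn in stages:
        data = ComputeFilter(fn).run(data)
    return SendFilter().run(data)


def run_replicated(streams: Sequence[Iterable[Any]],
                   stages: Sequence[Callable[[Any], Any]],
                   workers: int = 4) -> List[List[Any]]:
    """Replicate the same filter chain once per stream and run the copies
    concurrently; results keep the order of the input streams."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: run_pipeline(s, stages), streams))


if __name__ == "__main__":
    streams = [[1, 2, 3], [10, 20]]
    stages = [lambda x: x * 2, lambda x: x + 1]
    print(run_replicated(streams, stages))  # prints [[3, 5, 7], [21, 41]]
```

The application programmer supplies only the functions in `stages`; the receive and send ends are fixed by the sketch, mirroring the split between user-written computation filters and runtime-provided communication filters.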
Pages: 1179-1187
Number of pages: 9