Characterization of Power Usage and Performance in Data-Intensive Applications Using MapReduce over MPI

Cited by: 2
Authors
Davis, Joshua [1]
Gao, Tao [1]
Chandrasekaran, Sunita [1]
Jagode, Heike [2]
Danalis, Anthony [2]
Dongarra, Jack [2]
Balaji, Pavan [3]
Taufer, Michela [2]
Affiliations
[1] Univ Delaware, Newark, DE 19716 USA
[2] Univ Tennessee Knoxville, Knoxville, TN USA
[3] Argonne Natl Lab, Lemont, IL USA
Source
Keywords
Data management; KNL; KNM; PAPI; Combiner optimizations
DOI
10.3233/APC200053
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a quantitative evaluation of the power usage over time in data-intensive applications that use MapReduce over MPI. We leverage the PAPI powercap tool to identify ideal conditions for execution of our mini-applications in terms of (1) dataset characteristics (e.g., unique words in datasets); (2) system characteristics (e.g., KNL and KNM); and (3) implementation of the MapReduce programming model (e.g., impact of various optimizations). Results illustrate the high power utilization and runtime costs of data management on HPC architectures.
Pages: 287-298
Number of pages: 12