HMM Optimized Modeling of SSD Storage for I/O MapReduce Workloads

Authors
Alsayoud, Fatimah [1]
Miri, Ali [1]
Affiliations
[1] Ryerson Univ, Dept Comp Sci, Toronto, ON, Canada
Keywords
Flash resource management; R/W ratio; IO patterns; Hidden Markov Model; Storage policies; MapReduce Workloads;
DOI
10.1109/iemcon.2019.8936243
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Flash-based SSDs have drawn considerable interest in big data platforms because of their performance and reliability. However, their use remains limited by high cost and limited capacity. Controlling SSD provisioning on big data platforms reduces storage cost and guarantees performance. The workload is an essential input to SSD provisioning, so analyzing workload characteristics helps optimize SSD management design. There is a significant correlation between a workload's IO patterns and SSD cost and performance. Big data platforms with a multi-stage architecture make IO patterns challenging to model, because each stage has its own unique IO patterns. In addition, big data platforms run in a distributed environment where workloads interact with both local and remote storage during execution. The designed HMM-based IO pattern model considers the IO patterns of MapReduce workloads at different stages and different SSD locations. In this paper, we propose a platform-level SSD cost-efficiency controller. The controller is responsible for maximizing SSD lifespan on the Hadoop platform in two phases: first, it models the IO patterns of MapReduce workloads using a Hidden Markov Model (HMM); then, it defines platform-level SSD allocation policies. The designed allocation policies reduce SSD utilization and improve SSD lifespan on Hadoop by up to 40% compared to static allocation policies.
Pages: 177-183
Number of pages: 7
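
The two phases named in the abstract (HMM-based IO pattern modeling, then an allocation policy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two hidden states, the three discretized R/W-ratio symbols, all probabilities, and the toy rule that places read-heavy windows on SSD. The sketch uses standard Viterbi decoding to recover the most likely hidden IO state per time window.

```python
import math

# Hypothetical hidden states: dominant IO behavior of a MapReduce stage.
STATES = ("read_heavy", "write_heavy")

# Illustrative parameters only; none of these values come from the paper.
# Observations are a hypothetical discretized read/write ratio per window.
START = {"read_heavy": 0.6, "write_heavy": 0.4}
TRANS = {
    "read_heavy": {"read_heavy": 0.8, "write_heavy": 0.2},
    "write_heavy": {"read_heavy": 0.3, "write_heavy": 0.7},
}
EMIT = {
    "read_heavy": {"low_rw": 0.1, "mid_rw": 0.3, "high_rw": 0.6},
    "write_heavy": {"low_rw": 0.6, "mid_rw": 0.3, "high_rw": 0.1},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence (Viterbi, log space)."""
    if not observations:
        return []
    # scores[s]: best log-probability of any path that ends in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s given the scores so far.
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def allocate(path):
    """Toy policy (an assumption): SSD for read-heavy windows, HDD otherwise."""
    return ["ssd" if state == "read_heavy" else "hdd" for state in path]


if __name__ == "__main__":
    windows = ["high_rw", "high_rw", "low_rw", "low_rw"]
    decoded = viterbi(windows)
    print(decoded)
    print(allocate(decoded))
```

In practice the transition and emission probabilities would be estimated from IO traces rather than written by hand, and a real policy would also have to account for stage boundaries and for local versus remote SSD placement, which the abstract identifies as part of the model.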