Scheduling of big data applications on distributed cloud based on QoS parameters

Cited by: 0
Authors
Rajinder Sandhu
Sandeep K. Sood
Affiliation
[1] Guru Nanak Dev University,
Source
Cluster Computing | 2015 / Volume 18
Keywords
Big data; Cloud computing; Quality of Service (QoS); Hadoop; Self organizing maps; K nearest neighbor;
DOI
Not available
Abstract
Big data is one of the major technologies used for business operations in today's competitive market. It gives organizations a powerful tool for analyzing large volumes of unstructured data and making useful decisions. The result quality, time, and price associated with big data analytics are critical to its success, and selecting appropriate cloud infrastructure at both the coarse-grained and fine-grained levels helps ensure better results. This paper proposes a global architecture for QoS-based scheduling of big data applications onto distributed cloud datacenters at two levels, coarse grained and fine grained. At the coarse-grained level, an appropriate local datacenter is selected using an adaptive K nearest neighbor algorithm, based on the network distance between user and datacenter, the network throughput, and the total available resources. At the fine-grained level, a probability triplet (C, I, M) is predicted using the naïve Bayes algorithm; it gives the probability that a new application falls into the compute-intensive (C), input/output-intensive (I), and memory-intensive (M) categories. Using self-organizing maps, each datacenter is transformed into a pool of virtual clusters, each capable of executing a specific category of jobs with specific (C, I, M) requirements. The novelty of the study lies in representing all datacenter resources in a predefined topological ordering and executing incoming jobs in their respective predefined virtual clusters according to their QoS requirements. The proposed architecture is tested on three different Amazon EMR datacenters for resource utilization, waiting time, availability, response time, and estimated time to complete the job. The results indicate better QoS achievement and a 33.15% cost gain for the proposed architecture over traditional Amazon methods.
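The fine-grained step in the abstract, predicting a probability triplet (C, I, M) with naïve Bayes, can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian likelihood, the three features (CPU, I/O, and memory shares of a job profile), the training rows in `HISTORY`, and the function names `fit` and `predict_triplet` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical job history: (cpu_share, io_share, mem_share) -> category.
# Features and values are illustrative assumptions, not data from the paper.
HISTORY = [
    ((0.90, 0.20, 0.30), "C"),
    ((0.80, 0.30, 0.20), "C"),
    ((0.85, 0.25, 0.35), "C"),
    ((0.20, 0.90, 0.30), "I"),
    ((0.30, 0.80, 0.20), "I"),
    ((0.25, 0.85, 0.25), "I"),
    ((0.30, 0.20, 0.90), "M"),
    ((0.20, 0.30, 0.80), "M"),
    ((0.35, 0.25, 0.85), "M"),
]


def fit(samples, var_floor=1e-3):
    """Estimate a prior plus per-feature mean and variance for each category."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature cannot cause division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def log_gauss(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict_triplet(model, features):
    """Return posterior probabilities ordered as (C, I, M)."""
    log_post = {
        label: math.log(prior)
        + sum(log_gauss(x, m, v) for x, m, v in zip(features, means, variances))
        for label, (prior, means, variances) in model.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post.values())
    weights = {label: math.exp(lp - peak) for label, lp in log_post.items()}
    total = sum(weights.values())
    return tuple(weights.get(label, 0.0) / total for label in ("C", "I", "M"))


model = fit(HISTORY)
triplet = predict_triplet(model, (0.88, 0.22, 0.30))
print("(C, I, M) =", triplet)
```

In the architecture the abstract describes, a triplet like this would then guide placement of the job into a virtual cluster whose (C, I, M) profile matches; how that matching is done with self-organizing maps is not specified in the abstract and is not modeled here.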
Pages: 817-828
Number of pages: 11
Related papers
50 in total
  • [1] Scheduling of big data applications on distributed cloud based on QoS parameters
    Sandhu, Rajinder
    Sood, Sandeep K.
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (02): : 817 - 828
  • [2] Online Task Scheduling of Big Data Applications in the Cloud Environment
    Bouhouch, Laila
    Zbakh, Mostapha
    Tadonki, Claude
    [J]. INFORMATION, 2023, 14 (05)
  • [3] A Framework for Scheduling and Managing Big Data Applications in a Distributed Infrastructure
    Govindarajan, Kannan
    Somasundaram, Thamarai Selvi
    Boulanger, David
    Kumar, Vivekanandan Suresh
    Kinshuk
    [J]. 2015 SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2015,
  • [4] Scheduling workflows with privacy protection constraints for big data applications on cloud
    Wen, Yiping
    Liu, Jianxun
    Dou, Wanchun
    Xu, Xiaolong
    Cao, Buqing
    Chen, Jinjun
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 108 : 1084 - 1091
  • [5] Intelligent cloud workflow management and scheduling method for big data applications
    Hu, Yannian
    Wang, Hui
    Ma, Wenge
    [J]. JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2020, 9 (01):
  • [6] Intelligent cloud workflow management and scheduling method for big data applications
    Yannian Hu
    Hui Wang
    Wenge Ma
    [J]. Journal of Cloud Computing, 9
  • [7] Hybrid Big Bang-Big Crunch based resource scheduling to improve QoS in cloud infrastructure
    Gupta, Punit
    Saini, Dinesh Kumar
    Rawat, Pradeep Singh
    Bhagat, Sajit
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (02) : 1887 - 1895
  • [8] Priority-based Resource Scheduling in Distributed Stream Processing Systems for Big Data Applications
    Bellavista, Paolo
    Corradi, Antonio
    Reale, Andrea
    Ticca, Nicola
    [J]. 2014 IEEE/ACM 7TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC), 2014, : 363 - 370
  • [9] An Efficient Scheduling of HPC Applications on Geographically Distributed Cloud Data Centers
    Rajabi, Aboozar
    Faragardi, Hamid Reza
    Nolte, Thomas
    [J]. COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS, CNDS 2013, 2014, 428 : 155 - 167
  • [10] Cloud Based Web Scraping for Big Data Applications
    Chaulagain, Ram Sharan
    Pandey, Santosh
    Basnet, Sadhu Ram
    Shakya, Subarna
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD), 2017, : 138 - 143