Event Segmentation using MapReduce based Big Data Clustering

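Authors
Shafiq, M. Omair [1]
Affiliation
[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Event segmentation; Clustering; Parallel; MapReduce;
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract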
Event segmentation is an important step in monitoring and management applications: it categorizes events into distinct segments. It matters most when the applications to be monitored and managed are large-scale, comprehensive and data-intensive. Segmentation is based on data clustering, one of the key data mining methods in current use. Several established algorithms and techniques exist for clustering small to medium scale data. In the era of Big Data, large-scale and data-intensive applications produce log events with significantly increased volume, variety and velocity. This makes clustering such large amounts of data more challenging and limits the applicability of existing techniques. This paper proposes an effective and efficient approach to event segmentation in logs. The approach is based on parallel k-means clustering that follows the MapReduce paradigm. It has been tested and evaluated on large-scale log data derived from a real-life case study. The evaluation measures the efficiency and effectiveness of the proposed solution, both for its usability on log data with large volume, variety and velocity and for its applicability to large-scale applications.
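The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of how k-means is commonly expressed as map and reduce steps, not the paper's implementation. It assumes log events have already been encoded as numeric feature vectors (the record does not say how); the function names `map_phase`, `reduce_phase` and `kmeans_mapreduce` are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map step: emit (index of nearest centroid, (point, 1)) for each point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield idx, (p, 1)


def reduce_phase(pairs, centroids):
    """Reduce step: sum vectors and counts per centroid index, then average.

    Centroids that received no points keep their previous position.
    """
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx not in sums:
            sums[idx] = list(p)
        else:
            sums[idx] = [a + b for a, b in zip(sums[idx], p)]
        counts[idx] += n
    new_centroids = list(centroids)
    for idx, total in sums.items():
        new_centroids[idx] = tuple(s / counts[idx] for s in total)
    return new_centroids


def kmeans_mapreduce(points, centroids, max_iter=20, tol=1e-9):
    """Repeat map + reduce until no centroid moves more than tol."""
    for _ in range(max_iter):
        new_centroids = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids


# Hypothetical 2-D feature vectors standing in for encoded log events.
events = [(0, 0), (0, 2), (10, 10), (10, 12)]
segments = kmeans_mapreduce(events, [(0, 0), (10, 10)])
```

In an actual MapReduce deployment the map step would run in parallel over partitions of the log data and the reduce step would aggregate per centroid key; here both run sequentially to keep the sketch self-contained.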
Pages: 1857-1866
Number of pages: 10