SC-OCR: similarity-based clustering and optimum cache replacement approach

Cited by: 1
Authors
Subramanian, Sabitha Malli [1]
Soundarajan, Vijayalakshmi [2]
Affiliations
[1] Bharathiar Univ, Res & Dev Ctr, Coimbatore 641046, Tamil Nadu, India
[2] Thiagarajar Coll Engn, Dept Comp Applicat, Madurai 625015, Tamil Nadu, India
Source
Keywords
big data computing; data cleaning; Hadoop; MapReduce; optimum cache replacement (OCR) algorithm; similarity-based clustering (SC); BIG; MAPREDUCE;
DOI
10.1002/cpe.3916
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202; 0835;
Abstract
Big data is a term for large-scale, complex datasets. It is expanding rapidly across all science and engineering domains, owing to the fast development of networking, data storage, and data collection capacity. Big data mining is the capability of extracting useful information from these large datasets. Integrating cloud computing with data mining for the big data mining process remains a challenging task. Processing such huge amounts of data requires improving big data computation. Most existing approaches use MapReduce to compute over big data; their main drawbacks are increased computational cost and memory consumption. To overcome these limitations, this paper proposes a similarity-based clustering and optimum cache replacement (SC-OCR) approach for big data computing applications. The job recovery process is initiated by copying the data in the cloud server and forwarding the copy for further processing. Then, the job is divided into clusters using the similarity-based clustering approach. Finally, a cache with the optimum cache replacement algorithm is introduced to avoid repeated execution of jobs through queue management. The proposed approach is compared with the existing Spark and Hadoop approaches and achieves better performance in terms of iteration time, query response time, job completion time, and clustering accuracy. Copyright (C) 2016 John Wiley & Sons, Ltd.
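The two ideas named in the abstract, grouping similar jobs and caching results so repeated jobs are not re-executed, can be illustrated with a minimal sketch. The abstract does not specify the paper's similarity measure or its OCR algorithm, so everything below is an assumption: jobs are modeled as sets of feature strings, similarity is Jaccard, clustering is a simple greedy threshold pass, and eviction uses Belady's farthest-next-use rule as a stand-in for "optimum" replacement over a known job queue. The names `jaccard`, `cluster_jobs`, and `run_queue` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Hashable, List, Sequence, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_jobs(jobs: Sequence[FrozenSet[str]],
                 threshold: float = 0.5) -> List[List[int]]:
    """Greedy clustering (assumed, not the paper's method): each job joins
    the first cluster whose first member is similar enough, else starts
    a new cluster. Returns clusters as lists of job indices."""
    clusters: List[List[int]] = []
    for i, job in enumerate(jobs):
        for members in clusters:
            if jaccard(jobs[members[0]], job) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def run_queue(queue: Sequence[Hashable], capacity: int,
              execute: Callable[[Hashable], object]) -> Tuple[List[object], int]:
    """Process a fully known job queue with a bounded result cache.
    On a miss with a full cache, evict the entry whose next use lies
    farthest ahead in the queue (Belady's rule, used here as a stand-in).
    Returns the results in queue order and the number of real executions."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache: Dict[Hashable, object] = {}
    executions = 0
    results: List[object] = []
    for pos, key in enumerate(queue):
        if key not in cache:
            if len(cache) >= capacity:
                def next_use(k: Hashable) -> int:
                    # Index of the next occurrence of k after the current job.
                    for j in range(pos + 1, len(queue)):
                        if queue[j] == k:
                            return j
                    return len(queue)  # never needed again
                del cache[max(cache, key=next_use)]
            cache[key] = execute(key)
            executions += 1
        results.append(cache[key])
    return results, executions
```

Belady's rule needs the future request order, which is why the sketch takes the whole queue up front; a system without that knowledge would have to fall back on a heuristic such as least-recently-used.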
Pages: 15