Handling Big Data Efficiently by using Map Reduce Technique

Cited by: 9
Authors
Maitrey, Seema [1]
Jha, C. K. [2]
Affiliations
[1] Krishna Inst Engn & Technol, Dept CSE, Ghaziabad, UP, India
[2] Banasthali Univ, Dept CSE, Niwai, Rajasthan, India
Keywords
Data Mining; Clustering; DBMS; Parallel processing; Hadoop; MapReduce; Performance
DOI
10.1109/CICT.2015.140
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Today's organizations capture extremely large amounts of data, and the volume continues to increase. Analyzing such huge data sets becomes computationally inefficient. Research in data mining has addressed the problem of discovering knowledge from these continuously growing data sets. The amount of raw data available has been increasing at an exponential rate, and valuable information is hidden in large databases. Data mining has become an interesting area for extracting this embedded information. For many years it has taken root in all kinds of application areas, which gave rise to many data mining methods that have been applied in several real-life fields. But not all of these methods are capable of handling huge collections of data. In recent years, a number of computation- and data-intensive scientific data analyses have been established. To perform large-scale data mining analyses that meet the scalability and performance requirements of big data, several efficient parallel and concurrent algorithms have been applied. Many parallel algorithms are implemented using different parallelization techniques; common ones include threads, MPI, and MapReduce, which yield different performance and usability characteristics. The MPI model works efficiently for compute-intensive problems, but it is complicated to bring into practical use. There is currently considerable enthusiasm around the MapReduce paradigm for large-scale data analysis. It is inspired by functional programming, which allows distributed computations to be expressed on massive amounts of data, and it is designed for large-scale data processing because it runs on clusters of commodity hardware. As a prominent parallel data processing tool, MapReduce is gaining significant momentum in both industry and academia as the volume of data to analyze grows rapidly.
In this paper, we examine MapReduce, its advantages and disadvantages, and how it can be integrated with other technologies.
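As an illustration of the functional-programming roots the abstract mentions, the following is a minimal single-process sketch of the map, shuffle, and reduce steps, using word counting as the example task. It is not taken from the paper: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the word-count task are assumptions chosen for illustration, and a real framework such as Hadoop would distribute these steps across a cluster rather than run them in one process.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run the three phases in sequence (hypothetical helper, single process)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data needs parallel processing", "MapReduce handles big data"]
    print(word_count(docs))
```

Because each call to map_phase depends only on its own document and each call to reduce_phase depends only on one key's values, both steps could in principle run in parallel on separate machines, which is the property that lets the model scale on commodity clusters.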
Pages: 703-708
Number of pages: 6