Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce

Cited by: 95
Authors
Ramirez-Gallego, Sergio [1]
Fernandez, Alberto [1]
Garcia, Salvador [1]
Chen, Min [2]
Herrera, Francisco [1]
Affiliations
[1] Univ Granada, Dept Comp Sci & Artificial Intelligence, Granada, Spain
[2] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Hubei, Peoples R China
Keywords
Big Data Analytics; MapReduce; Information fusion; Spark; Machine learning; Business intelligence; Systems; Insight
DOI
10.1016/j.inffus.2017.10.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We live in a world where data are generated from a myriad of sources, and it is very cheap to collect and store such data. However, the real benefit lies not in the data itself but in the algorithms that are capable of processing such data in a tolerable elapsed time and of extracting valuable knowledge from it. Therefore, the use of Big Data Analytics tools provides very significant advantages to both industry and academia. The MapReduce programming framework stands out as the main paradigm behind such tools. It is mainly characterized by distributed execution, which provides a high degree of scalability together with a fault-tolerant scheme. In every MapReduce algorithm, local models are first learned from a subset of the original data within the so-called Map tasks. Then, the Reduce task is devoted to fusing the partial outputs generated by each Map. The way this fusion of information/models is designed may have a strong impact on the quality of the final system. In this work, we enumerate and analyze two alternative methodologies that may be found both in the specialized literature and in standard Machine Learning libraries for Big Data. Our main objective is to provide an introduction to the characteristics of these methodologies, as well as some guidelines for the design of novel algorithms in this field of research. Finally, a short experimental study allows us to contrast the scalability issues of each type of process fusion in MapReduce for Big Data Analytics.
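The Map/Reduce split described in the abstract can be illustrated with a minimal single-machine sketch. It is not taken from the paper: the nearest-centroid model, the toy one-dimensional data, and the names map_task, reduce_task, centroids and predict are all assumptions chosen for illustration. Each Map task summarizes one data subset as per-class sums and counts, and the Reduce step fuses those partial outputs into one global model.

```python
from collections import defaultdict
from functools import reduce

def map_task(partition):
    """Learn a local model from one subset: per-class [feature sum, count]."""
    local = defaultdict(lambda: [0.0, 0])
    for x, label in partition:
        local[label][0] += x
        local[label][1] += 1
    return dict(local)

def reduce_task(model_a, model_b):
    """Fuse two partial models by adding their per-class sums and counts."""
    fused = {label: list(stats) for label, stats in model_a.items()}
    for label, (total, count) in model_b.items():
        if label in fused:
            fused[label][0] += total
            fused[label][1] += count
        else:
            fused[label] = [total, count]
    return fused

def centroids(model):
    """Turn fused statistics into one centroid per class."""
    return {label: total / count for label, (total, count) in model.items()}

def predict(cents, x):
    """Assign x to the class with the closest centroid."""
    return min(cents, key=lambda label: abs(cents[label] - x))

# Hypothetical toy data: (feature, class label) pairs.
data = [(1.0, "a"), (2.0, "a"), (3.0, "a"),
        (10.0, "b"), (12.0, "b"), (14.0, "b")]

# Two partitions stand in for the data blocks handled by separate Map tasks.
partitions = [data[0::2], data[1::2]]

partials = [map_task(p) for p in partitions]   # Map phase
global_model = reduce(reduce_task, partials)   # Reduce phase (fusion)
cents = centroids(global_model)
```

Because sums and counts are additive, this particular fusion yields the same centroids as training on the full data set. Fusing independently trained local models, for example by voting, would in general only approximate the single-machine result.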
Pages: 51-61
Page count: 11