Processing Big Data with Apache Hadoop in the Current Challenging Era of COVID-19

Cited by: 15
Authors
Azeroual, Otmane [1]
Fabre, Renaud [2]
Affiliations
[1] German Ctr Higher Educ Res & Sci Studies DZHW, D-10117 Berlin, Germany
[2] Univ Paris 08, Dionysian Econ Lab LED, F-93200 St Denis, France
Keywords
big data; data processing; unstructured data; large amounts of data; COVID-19; challenges; Hadoop technology; MapReduce; WordCount; analytics
DOI
10.3390/bdcc5010012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Big data have become a global strategic issue, as increasingly large amounts of unstructured data challenge the IT infrastructure of global organizations and threaten their capacity for strategic forecasting. As experienced in former massive information issues, big data technologies, such as Hadoop, should efficiently tackle the incoming large amounts of data and provide organizations with relevant processed information that was formerly neither visible nor manageable. After having briefly recalled the strategic advantages of big data solutions in the introductory remarks, in the first part of this paper, we focus on the advantages of big data solutions in the currently difficult time of the COVID-19 pandemic. We characterize it as an endemic heterogeneous data context; we then outline the advantages of technologies such as Hadoop and its IT suitability in this context. In the second part, we identify two specific advantages of Hadoop solutions, globality combined with flexibility, and we notice that they are at work with a "Hadoop Fusion Approach" that we describe as an optimal response to the context. In the third part, we justify selected qualifications of globality and flexibility by the fact that Hadoop solutions enable comparable returns in opposite contexts of models of partial submodels and of models of final exact systems. In part four, we remark that in both these opposite contexts, Hadoop's solutions allow a large range of needs to be fulfilled, which fits with requirements previously identified as the current heterogeneous data structure of COVID-19 information. In the final part, we propose a framework of strategic data processing conditions. To the best of our knowledge, they appear to be the most suitable to overcome COVID-19 massive information challenges.
Pages: 18
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 938 - 941