MapReduce-based storage and indexing for big health data

Cited by: 2
Authors
Gayathiri, N. R. [1]
Natarajan, A. M. [2]
Affiliations
[1] Bannari Amman Inst Technol, Dept Informat Technol, Sathyamangalam 638401, India
[2] KPR Inst Engn & Technol, Dept Comp Sci & Engn, Coimbatore, Tamil Nadu, India
Source
Keywords
cluster; Hadoop; HDFS; hyperplanes; LSH; MapReduce
DOI
10.1002/cpe.4854
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Locality-Sensitive Hashing (LSH) uses a randomized method to alleviate the nearest neighbor search problem in high-dimensional spaces. However, applying LSH to big datasets is difficult because of its computational complexity. The major aim of this work is therefore to introduce a new LSH algorithm on the Hadoop MapReduce framework that improves the efficiency of arbitrary reads over big datasets. The proposed hash index improves efficiency by creating buckets based on hyperplanes, which reduces the amount of data accessed for range queries. An LSH implementation on MapReduce is developed that decreases the random data access time between the map and reduce functions and thereby improves efficiency. Finally, to validate the performance of the presented index for search queries in MapReduce, five performance metrics are used: changing cluster size; LSH for bucket size balancing; the overlapped boundary of a hyperplane; bucket creation based on the configured capacity; and non-indexed, hash-indexed, and global-indexed datasets on the HDFS configured capacity. The effect of these metrics on the dataset on the HDFS configured capacity during the map and reduce functions also demonstrates the superiority of the presented hash index.
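The idea of hyperplane-based buckets described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common random-hyperplane (sign) variant of LSH, simulates the map and reduce steps in plain Python rather than on Hadoop, and all function names (`make_hyperplanes`, `bucket_key`, `map_phase`, `reduce_phase`, `query`) are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (assumed Gaussian; not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def bucket_key(vector, hyperplanes):
    """One bit per hyperplane: '1' if the vector lies on its non-negative side."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def map_phase(records, hyperplanes):
    """Map step (simulated): emit (bucket key, record) for every input record."""
    for record_id, vector in records:
        yield bucket_key(vector, hyperplanes), (record_id, vector)


def reduce_phase(pairs):
    """Reduce step (simulated): group the emitted records by bucket key."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[key].append(value)
    return dict(buckets)


def query(buckets, hyperplanes, vector):
    """Return ids from the query's bucket only, so other buckets are never read."""
    key = bucket_key(vector, hyperplanes)
    return [record_id for record_id, _ in buckets.get(key, [])]
```

For example, with the two fixed hyperplanes `[[1.0, 0.0], [0.0, 1.0]]` and the records `("a", [2.0, 3.0])`, `("b", [-1.0, 4.0])`, `("c", [2.5, 0.5])`, the query vector `[1.0, 1.0]` hashes to bucket `"11"` and returns `["a", "c"]`; record `"b"` falls in bucket `"01"` and is never touched. A query point near a hyperplane can miss true neighbors in the adjacent bucket, which is presumably the concern behind the "overlapped boundary" metric in the abstract.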
Pages: 11