MapReduce-based storage and indexing for big health data

Cited by: 2

Authors:

Gayathiri, N. R. [1]

Natarajan, A. M. [2]

Affiliations:

[1] Bannari Amman Inst Technol, Dept Informat Technol, Sathyamangalam 638401, India

[2] KPR Inst Engn & Technol, Dept Comp Sci & Engn, Coimbatore, Tamil Nadu, India

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2019 / Vol. 31 / Issue 14

Keywords:

cluster; Hadoop; HDFS; hyperplanes; LSH; MapReduce

DOI:

10.1002/cpe.4854

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Locality Sensitive Hashing (LSH) uses a randomized method to alleviate the Nearest Neighbor Search problem in high-dimensional spaces. However, handling big dataset samples with the LSH algorithm becomes a difficult task because of its computational complexity. The major aim of this work is therefore to introduce a new LSH algorithm on the Hadoop MapReduce framework to make arbitrary reads over big dataset samples more efficient. The proposed Hash index improves efficiency by creating buckets based on hyperplanes, which reduces the amount of data accessed for range queries. An LSH on MapReduce is developed, which decreases the random data access time between the map and reduce functions and further improves efficiency. Lastly, to validate the performance of the presented index for search queries in MapReduce, five performance metrics are used: changing cluster size, LSH for bucket size balancing, the overlapped boundary of a hyperplane, bucket creation based on the configured capacity, and a comparison of non-indexed, Hash-indexed, and globally indexed datasets at the configured HDFS capacity. The effect of these metrics on the dataset at the configured HDFS capacity during the map and reduce functions also demonstrates the superiority of the presented Hash index.
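The abstract does not give the details of the proposed index, so the following is only a minimal sketch of the general idea of hyperplane-based LSH bucketing arranged in a map/reduce style. It is not the authors' algorithm and does not use Hadoop; the function names (`make_hyperplanes`, `signature`, `map_phase`, `reduce_phase`), the number of hyperplanes, and the sample records are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (illustrative choice: Gaussian entries)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def map_phase(records, hyperplanes):
    """Map step: emit (bucket key, record id) for every record."""
    for record_id, vector in records:
        yield signature(vector, hyperplanes), record_id


def reduce_phase(pairs):
    """Reduce step: group record ids that share a bucket key."""
    buckets = defaultdict(list)
    for key, record_id in pairs:
        buckets[key].append(record_id)
    return dict(buckets)


# Hypothetical sample records: (id, feature vector).
records = [
    ("a", [1.0, 2.0, 3.0]),
    ("b", [1.1, 2.1, 2.9]),
    ("c", [-3.0, -1.0, -2.0]),
]
planes = make_hyperplanes(num_planes=4, dim=3)
buckets = reduce_phase(map_phase(records, planes))
```

Under this scheme a query vector is hashed with the same hyperplanes and only the records in the matching bucket need to be read, which is the general mechanism by which such an index can reduce the amount of data accessed.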


Pages: 11

50 items in total

[1] A MapReduce-based scalable discovery and indexing of structured big data
Singh, Hari
Bawa, Seema
[J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2017, 73 : 32 - 43
[2] A MapReduce-Based ELM for Regression in Big Data
Wu, B.
Yan, T. H.
Xu, X. S.
He, B.
Li, W. H.
[J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2016, 2016, 9937 : 164 - 173
[3] Atrak: a MapReduce-based data warehouse for big data
Barkhordari, Mohammadhossein
Niamanesh, Mahdi
[J]. JOURNAL OF SUPERCOMPUTING, 2017, 73 (10) : 4596 - 4610
[4] Atrak: a MapReduce-based data warehouse for big data
Mohammadhossein Barkhordari
Mahdi Niamanesh
[J]. The Journal of Supercomputing, 2017, 73 : 4596 - 4610
[5] A MapReduce-based Fuzzy Associative Classifier for Big Data
Ducange, Pietro
Marcelloni, Francesco
Segatori, Armando
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2015), 2015
[6] Knowledge process of health big data using MapReduce-based associative mining
Choi, So-Young
Chung, Kyungyong
[J]. PERSONAL AND UBIQUITOUS COMPUTING, 2020, 24 (05) : 571 - 581
[7] Verifying Properties of MapReduce-Based Big Data Processing
Zhang, Nan
Wang, Meng
Duan, Zhenhua
Tian, Cong
[J]. IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (01) : 321 - 338
[8] Knowledge process of health big data using MapReduce-based associative mining
So-Young Choi
Kyungyong Chung
[J]. Personal and Ubiquitous Computing, 2020, 24 : 571 - 581
[9] An Accelerated MapReduce-Based K-prototypes for Big Data
Ben HajKacem, Mohamed Aymen
Ben N'cir, Chiheb-Eddine
Essoussi, Nadia
[J]. SOFTWARE TECHNOLOGIES: APPLICATIONS AND FOUNDATIONS (STAF 2016), 2016, 9946 : 13 - 25
[10] A MapReduce-based approach to social network big data mining
Qi, Fuli
[J]. JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2023, 23 (05) : 2535 - 2547
