Big data classification using deep learning and apache spark architecture

Cited by: 4

Authors:

Brahmane, Anilkumar V. [1]

Krishna, B. Chaitanya [1]

Affiliation:

[1] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Vaddeswaram, AP, India

Source:

NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, Issue 22

Keywords:

Big data; Spark framework; Deep stacked auto-encoder; Classification; Data analysis; DATA STREAMS; ALGORITHM;

DOI:

10.1007/s00521-021-06145-w

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Irregularity in big data is rising steadily, so current software tools have difficulty managing big data. Moreover, the rate of irregular data in huge datasets is a key constraint for the research industry. This paper therefore proposes a novel method for handling big data using the Spark framework. The proposed method classifies big data in two stages, feature selection and classification, which are performed in the initial nodes of the Spark architecture. The proposed optimization algorithm, named Rider Chaotic Biogeography Optimization (RCBO), integrates the Rider Optimization Algorithm (ROA) with standard chaotic biogeography-based optimization (CBBO). The proposed RCBO-deep stacked auto-encoder using the Spark framework handles big data effectively to achieve efficient big data classification. Here, RCBO is used to select suitable features from the massive dataset. In addition, the deep stacked auto-encoder is trained with RCBO in order to classify huge datasets. This research focuses on the problem of managing big data for the Covertype dataset in the UCI Machine Learning Repository. The dataset describes forest cover data and is used to predict the forest cover type from cartographic variables. It is multivariate, with 263,361 web hits, 581,012 instances, and 54 attributes, and its associated task is classification. Analysis of the proposed RCBO-deep stacked auto-encoder-based Spark framework on the UCI machine learning datasets showed that the proposed method outperformed other methods, achieving a maximal accuracy of 86.71%, Dice coefficient of 92.7%, sensitivity of 75.2%, and specificity of 95.4%.
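
The four percentages reported at the end of the abstract are standard confusion-matrix measures: accuracy, Dice coefficient, sensitivity, and specificity. As a minimal sketch of how such measures are commonly computed, the Python function below derives them from binary confusion counts. The function name `binary_metrics` and the example counts are hypothetical, and the abstract does not state how the measures were aggregated across the forest cover classes, so the binary (one-vs-rest) formulation here is an assumption rather than the authors' procedure.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification measures from binary confusion counts.

    tp, fp, tn, fn: true positives, false positives,
    true negatives, false negatives.
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all predictions that are correct.
        "accuracy": (tp + tn) / total,
        # Dice coefficient (equal to the F1 score for binary labels).
        "dice": 2 * tp / (2 * tp + fp + fn),
        # Sensitivity (recall): fraction of actual positives found.
        "sensitivity": tp / (tp + fn),
        # Specificity: fraction of actual negatives correctly rejected.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only; not results from the paper.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class problem, one common choice is to compute these per class in a one-vs-rest fashion and then average them; whether the paper does so is not stated in this record.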


Pages: 15253-15266

Page count: 14
