Big data Predictive Analytics for Apache Spark using Machine Learning

Cited by: 0

Authors:

Junaid, Muhammad ^{[1]}

Wagan, Shiraz Ali ^{[1]}

Qureshi, Nawab Muhammad Faseeh ^{[2]}

Nam, Choon Sung ^{[1]}

Shin, Dong Ryeol ^{[1]}

Affiliations:

[1] Sungkyunkwan Univ, Elect & Comp Engn, Suwon, South Korea

[2] Sungkyunkwan Univ, Dept Comp Educ, Seoul, South Korea

Source:

2020 GLOBAL CONFERENCE ON WIRELESS AND OPTICAL TECHNOLOGIES (GCWOT) | 2020

Funding:

National Research Foundation of Singapore;

Keywords:

apache-spark; clusters; predictive analysis; MLlib; pandas; 5Vs of big data; PLACEMENT STRATEGY; HADOOP;

DOI:

10.1109/GCWOT49901.2020.9391620

Chinese Library Classification (CLC) number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

In today's digital world, data is produced at a rapid pace, and handling this massive, diverse data has become more challenging. A big data environment can handle data efficiently, both from data warehouses and in real time. Within such an environment, Apache Spark is a cluster-based, open-source computing technology designed explicitly for handling bulk data. Apache Spark performs complex analytics through in-memory processing, which plays an active role in the meaningful exploration of large amounts of data through machine learning. Spark's machine learning API is known as MLlib. It is prominent and efficient on big data platforms and offers extensive functionality. In this paper, we perform an experiment to examine the analytical qualities of MLlib in the Apache Spark environment. We also highlight recent trends in machine learning in big data studies and provide an outlook on future work.


Pages: 7
