Performance Analysis of Machine Learning Techniques on Big Data Using Apache Spark

被引:3
|
作者
Mogha, Garima [1 ]
Ahlawat, Khyati [1 ]
Singh, Amit Prakash [1 ]
机构
[1] Guru Gobind Singh Indraprastha Univ, Univ Sch Informat Commun & Technol, New Delhi, India
来源
DATA SCIENCE AND ANALYTICS | 2018年 / 799卷
关键词
Big data; Apache spark; Machine learning; Apache hadoop; DATA ANALYTICS; CLASSIFICATION;
D O I
10.1007/978-981-10-8527-7_2
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Applying Intelligence to the machines is a need in today's world and this need leads to the evolution of machine learning. The analysis of data using machine learning algorithms is a trending research area and this analysis lead to some problems when the data comes out to be big data. This paper compares various classification based machine learning algorithms namely, Decision Tree Learning, Naive Bayes, Random Forest and Support Vector Machines on big data using Apache Spark. The accuracy is evaluated to find out which classification based algorithm gives fast and better result.
引用
收藏
页码:17 / 26
页数:10
相关论文
共 50 条
  • [21] Performance evaluation of intrusion detection based on machine learning using Apache Spark
    Belouch, Mustapha
    El Hadaj, Salah
    Idhammad, Mohamed
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING IN DATA SCIENCES (ICDS2017), 2018, 127 : 1 - 6
  • [22] Sehaa: A Big Data Analytics Tool for Healthcare Symptoms and Diseases Detection Using Twitter, Apache Spark, and Machine Learning
    Alotaibi, Shoayee
    Mehmood, Rashid
    Katib, Iyad
    Rana, Omer
    Albeshri, Aiiad
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (04):
  • [23] Performance Analysis of ECG Big Data using Apache Hive and Apache Pig
    Ahmad, Mudassar
    Kanwal, Safina
    Cheema, Maryam
    Habib, Muhammad Asif
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES (ICICT 2019), 2019, : 2 - 7
  • [24] Big data analytics on Apache Spark
    Salloum S.
    Dautov R.
    Chen X.
    Peng P.X.
    Huang J.Z.
    [J]. International Journal of Data Science and Analytics, 2016, 1 (3-4) : 145 - 164
  • [25] Big Data Analytics using Machine Learning Techniques
    Mittal, Shweta
    Sangwan, Om Prakash
    [J]. 2019 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2019), 2019, : 203 - 207
  • [26] Big Data Network Flow Processing Using Apache Spark
    Jerabek, Kamil
    Rysavy, Ondrej
    [J]. PROCEEDINGS OF THE 6TH CONFERENCE ON THE ENGINEERING OF COMPUTER BASED SYSTEMS (ECBS 2019), 2020,
  • [27] MLlib: Machine learning in Apache Spark
    Meng, Xiangrui
    Bradley, Joseph
    Yavuz, Burak
    Sparks, Evan
    Venkataraman, Shivaram
    Liu, Davies
    Freeman, Jeremy
    Tsai, D.B.
    Amde, Manish
    Owen, Sean
    Xin, Doris
    Xin, Reynold
    Franklin, Michael J.
    Zadeh, Reza
    Zaharia, Matei
    Talwalkar, Ameet
    [J]. Journal of Machine Learning Research, 2016, 17
  • [28] MLlib: Machine Learning in Apache Spark
    Meng, Xiangrui
    Bradley, Joseph
    Yavuz, Burak
    Sparks, Evan
    Venkataraman, Shivaram
    Liu, Davies
    Freeman, Jeremy
    Tsai, D. B.
    Amde, Manish
    Owen, Sean
    Xin, Doris
    Xin, Reynold
    Franklin, Michael J.
    Zadeh, Reza
    Zaharia, Matei
    Talwalkar, Ameet
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [29] Road Traffic Event Detection Using Twitter Data, Machine Learning, and Apache Spark
    Alomari, Ebtesam
    Mehmood, Rashid
    Katib, Iyad
    [J]. 2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019), 2019, : 1888 - 1895
  • [30] Big Spatial Data Processing With Apache Spark
    Boyi Shangguan
    Peng Yue
    Wu, Zhaoyan
    Jiang, Liangcun
    [J]. 2017 6TH INTERNATIONAL CONFERENCE ON AGRO-GEOINFORMATICS, 2017, : 239 - 242