Evaluation of Machine Learning Frameworks on Bank Marketing and Higgs Datasets

Cited by: 4
Authors
Shashidhara, Bhuvan M. [1]
Jain, Siddharth [1]
Rao, Vinay D. [1]
Patil, Nagamma [1]
Raghavendra, G. S. [1]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Surathkal, India
Keywords
Machine Learning Algorithms; Big Data; Parallel Execution; Distributed Computing; WEKA; Scikit-Learn; Apache Spark
DOI
10.1109/ICACCE.2015.31
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Big data is an emerging field in which datasets of various sizes are being analyzed for potential applications. In parallel, many frameworks are being introduced through which these datasets can be fed into machine learning algorithms. Although some experiments have compared different machine learning algorithms on different data, these comparisons have not been carried out across different platforms. Our research compares two selected machine learning algorithms on datasets of different sizes, deployed on different platforms, namely Weka, Scikit-Learn and Apache Spark. The platforms are evaluated on training time, accuracy and root mean squared error. This comparison helps decide which platform is best suited for applying the selected, computationally expensive machine learning algorithms to data of a particular size. The experiments suggest that Scikit-Learn is optimal for data that fits into memory. For huge data, Apache Spark is optimal because it performs parallel computations by distributing the data over a cluster. The study therefore concludes that the Spark platform, which has growing support for parallel implementations of machine learning algorithms, could be optimal for analyzing big data.
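The three evaluation criteria named in the abstract (training time, accuracy and root mean squared error) can be illustrated with a minimal, framework-independent sketch. It is not the authors' benchmark code: the `evaluate` helper, the placeholder `train_majority_classifier` learner and the toy data are all hypothetical, and a real comparison would plug in the Weka, Scikit-Learn or Spark training call instead.

```python
import math
import time
from collections import Counter


def train_majority_classifier(features, labels):
    """Hypothetical placeholder learner: always predicts the most common training label."""
    majority_label = Counter(labels).most_common(1)[0][0]
    return lambda row: majority_label


def evaluate(train_fn, x_train, y_train, x_test, y_test):
    """Time the training call, then score predictions on held-out data."""
    start = time.perf_counter()
    model = train_fn(x_train, y_train)
    training_time = time.perf_counter() - start

    predictions = [model(row) for row in x_test]
    n = len(y_test)
    # Accuracy: fraction of test labels predicted exactly.
    accuracy = sum(p == t for p, t in zip(predictions, y_test)) / n
    # RMSE over numeric class labels (0/1 here).
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, y_test)) / n)
    return {"training_time_s": training_time, "accuracy": accuracy, "rmse": rmse}


# Toy data, invented for illustration only.
x_train = [[0.1], [0.4], [0.8], [0.9]]
y_train = [0, 1, 1, 1]
x_test = [[0.2], [0.7], [0.3], [0.6]]
y_test = [0, 1, 1, 1]

result = evaluate(train_majority_classifier, x_train, y_train, x_test, y_test)
print(result)
```

Passing the training routine in as a function keeps the timing and scoring logic identical across platforms, so only the learner changes between runs.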
Pages: 551-555
Page count: 5
Related papers
50 in total
  • [1] An Evaluation of Machine Learning Frameworks
    Wafo, Franck
    Mabou, Ivan Cedric
    Heilmann, Dan
    Zengeler, Nico
    Handmann, Uwe
    PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021), 2021: 1411-1416
  • [2] Machine Learning Metrics for Network Datasets Evaluation
    Soukup, Dominik
    Uhricek, Daniel
    Vasata, Daniel
    Cejka, Tomas
    ICT SYSTEMS SECURITY AND PRIVACY PROTECTION, IFIP SEC 2023, 2024, 679: 307-320
  • [3] A Comparison of Machine Learning Techniques for Classification in Bank Marketing Data
    Saengthongrattanachot, Waritpon
    Na-udom, Anamai
    Rungrattanaubol, Jaratsri
    THAI JOURNAL OF MATHEMATICS, 2022: 157-168
  • [4] Bank Direct Marketing Analysis of Asymmetric Information Based on Machine Learning
    Ruangthong, Pumitara
    Jaiyen, Saichon
    PROCEEDINGS OF THE 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2015: 93-96
  • [5] Kernel Centric Machine Learning Classifiers for Anomaly Detection with Real Bank Datasets
    Jidiga, Goverdhan Reddy
    Porika, Sammulal
    2015 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2015
  • [6] Performance Evaluation of Machine Learning Frameworks for Aphasia Assessment
    Mahmoud, Seedahmed S.
    Kumar, Akshay
    Li, Youcun
    Tang, Yiting
    Fang, Qiang
    SENSORS, 2021, 21(08)
  • [7] The Higgs Machine Learning Challenge
    Adam-Bourdarios, C.
    Cowan, G.
    Germain-Renaud, C.
    Guyon, I.
    Kegl, B.
    Rousseau, D.
    21ST INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP2015), PARTS 1-9, 2015, 664
  • [8] Machine Learning in Marketing
    Brei, Vinicius Andrade
    FOUNDATIONS AND TRENDS IN MARKETING, 2020, 14(03): 173-236
  • [9] AutoMLBench: A comprehensive experimental evaluation of automated machine learning frameworks
    Eldeeb, Hassan
    Maher, Mohamed
    Elshawi, Radwa
    Sakr, Sherif
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 243
  • [10] Evaluation of Machine Learning Algorithms for the Detection of Fake Bank Currency
    Yadav, Anju
    Jain, Tarun
    Verma, Vivek Kumar
    Pal, Vipin
    2021 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2021), 2021: 810-815